ABSTRACT


A preliminary study was made of the requirements, criteria, and
measures of performance of information storage and retrieval systems.
A total of 92 applied electronics researchers and 11 metallurgists were
interviewed to measure and rank several different requirements for
information. It was found that some requirements could definitely be
measured, and that there was general disagreement among the users about
the relative importance of various information requirements. The
methodology and the interview guide could be extended, with minor
modifications, to other technical subject fields. In addition, three
separate and complementary tools were developed for the analysis and
evaluation of information retrieval systems: (1) a coarse screening
procedure; (2) two different performance evaluation procedures; and
(3) two cost analysis procedures that used computer programs to
simulate the operation of candidate systems to determine their
operating costs over wide ranges in operating conditions. A general
functional model of a storage and retrieval system was developed for
use by these cost analysis programs. A number of specific research
tasks are also suggested to further develop the techniques for the
determination of user requirements and the measurement of the
performance of information storage and retrieval systems.

REQUIREMENTS, CRITERIA, AND MEASURES OF PERFORMANCE
OF INFORMATION STORAGE AND RETRIEVAL SYSTEMS

I  INTRODUCTION 

Increasing amounts of money are being spent by government and
commercial organizations for complex systems and equipment for the
partial mechanization of the operations of collection, storage, and
retrieval of scientific information. In addition to this equipment
cost, a large amount of money is being spent to support special
information services and centers. Undoubtedly, the main objective of
these efforts is to increase the productivity of those people who must
use scientific and technical knowledge to further their work. The
present and projected rates of generation of scientific knowledge, and
the greater reliance of all societies on progress through science, give
growing importance to the making of correct choices among proposed
information storage and retrieval systems.

There are no simple rules by which intelligent choices can be made
among the many information systems that are pressing for attention.
Many of these systems involve not only large complexes of files and
information specialists, but also extremely expensive equipment. In
the face of a whole array of such intricate information systems, the
evaluative techniques known to systems engineering and to operations
research are hard pressed to select from the competing alternatives
those that will most efficiently satisfy the users of scientific
information within specified time and cost constraints. The problem is
aggravated by the consideration that the stakes involved in the choices
are likely to increase with time. This is because the information
retrieval systems proposed in the future to assist the scientist will be
apt to cost more than present ones; however, in return they will
undoubtedly offer greater gains.

There is an immediate need to make choices among the present array
of systems and machines for information retrieval. The lack of
sophisticated techniques by which such comparisons can be made calls for
the rapid development of rough but logical measures-of-worth for
candidate systems. At the same time, a need exists for the development
of a longer-range research effort aimed at improving the methodology for
comparison of information systems. Such research would ultimately
result also in a better understanding of the role of information systems
in increasing scientific productivity.

The  work  reported  here  was  directed  primarily  to  the  first  need — 
namely,  the  fairly  rapid  development  of  rough  measures-of-worth  for 
candidate  systems.  Specifically,  the  objectives  were  fivefold: 

(1) To develop a methodology for determining users'
requirements

(2) To obtain specific data about the information
requirements of a particular community of users

(3) To develop a preliminary set of criteria and a
procedure that can be applied to existing information
retrieval systems in order to reach tentative
conclusions about the desirability of such systems

(4) To develop specific measures of system performance

(5) To develop plans for a research program for the
longer-range development of more basic and exhaustive
criteria and methods for the assessment of
alternative systems and procedures.

Many useful user studies have been conducted in the past, but few
of them have been directly concerned with methods for measuring
requirements. For example, several studies determined the type of
journals that were read, the places at which reading was done, and the
complaints that users had about present library service. The present
study has been successful, to a limited degree, in developing an interim
methodology by which some of the requirements of the users can be
measured and described in quantitative terms, for nearly any technical
field that requires continued reference to technical literature.

Engineers and other scientific and technical workers have
requirements for many different types of information such as: (1) current
awareness; (2) specific information to help with current project work;
and (3) exhaustive searches that are usually performed as a separate
project, or as a prelude to the major effort of a project. This study
restricted its attention to the second and third types, while
considering the requirements for formal technical literature (e.g., books,
journal articles, report literature, and conference proceedings) and
the types of information request that would likely be directed to a
national library or special information center for a particular subject
field. The evaluation procedures were developed to assess the degree
to which storage and retrieval systems satisfied these types of
requirements. These procedures are preliminary, and need improvement.

In addition to the improvement of evaluation procedures, a great
deal of work still remains to be done to find ways in which the users'
needs for information can be determined accurately. The users'
requirements must be described in greater detail before any evaluation
procedures are implemented. If they are not, then the evaluation
procedures have little significance.

A discussion of the methods for measuring the user requirements,
and the results obtained from a sample survey of a specific population
of users, is given in Secs. III and IV on survey methodology and survey
results. Section V describes a generalized functional model of a storage
and retrieval system. Section VI describes the criteria, measures of
performance, and analysis techniques that were developed, and evaluates
three representative retrieval systems using some of these techniques.
Finally, Sec. VII provides some suggestions for future research work to
extend and improve the results that have been achieved to date. A
sample of the interview guide used in the survey, two computer programs,
and additional supporting data are included in the Appendices.

II SUMMARY AND CONCLUSIONS


After interviewing over 90 researchers in electronics and 11
metallurgists, it was found that, to a limited degree, information
requirements could be measured quantitatively, and measures could be
formulated of the relative importance of each of these requirements.

The interviews did provide a composite or over-all agreement on the
relative importance of seven different factors, the most important of
which was agreed to be the response time: the time between the request
and the receipt of the major group of relevant references. However,
there was no strong agreement as to the relative importance of the
other six factors. In addition to the difficulty of obtaining true
rankings, it is also extremely difficult to measure some of the
requirements accurately and quantitatively. Some useful results were
obtained with the direct interview approach used here, although with
this and many other alternative approaches, it is very difficult to
avoid a conditioned response. The statements by a user reflect the
type of information service that he is accustomed to getting, so that
the study can never really separate need from habit. The
critical-incidents approach used here did not provide as clear-cut
results as had been anticipated.
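
One common way to put a number on how strongly a group of respondents
agrees about a ranking is Kendall's coefficient of concordance, W, which
runs from 0 (no agreement) to 1 (identical rankings). The report does not
say which statistic, if any, was applied to the seven-factor rankings, so
the following is only an illustrative sketch; the function name and the
sample rankings are hypothetical and are not survey data.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for complete, tie-free rankings.

    rankings[i][j] is the rank (1..n) that respondent i gave to factor j.
    Returns a value between 0 (no agreement) and 1 (identical rankings).
    """
    m = len(rankings)        # number of respondents
    n = len(rankings[0])     # number of factors being ranked
    # Total rank received by each factor across all respondents.
    totals = [sum(r[j] for r in rankings) for j in range(n)]
    mean_total = m * (n + 1) / 2
    # Sum of squared deviations of the rank totals from their mean.
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical example: three respondents ranking seven factors identically,
# and two respondents whose rankings are exact opposites.
identical = [[1, 2, 3, 4, 5, 6, 7]] * 3
opposed = [[1, 2, 3, 4, 5, 6, 7], [7, 6, 5, 4, 3, 2, 1]]
print(kendalls_w(identical), kendalls_w(opposed))
```

A value near 1 for the top-ranked factor's group and a low value overall
would be consistent with the pattern described above, but this formula
assumes complete rankings without ties.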

Three separate and complementary analysis procedures were developed
which give indications of being useful tools for the evaluation of
storage and retrieval systems. The first tool, a coarse screening
procedure, arranges empirical data to show the ranges of parameter
values that are likely to be encountered by candidate systems. This
tool could be used immediately; it can also be refined to make it even
more useful. The second tool, a performance evaluation procedure,
relates system performance to user requirements, while considering the
relative importance of each of these requirements, to arrive at a single
figure of merit or performance figure for each candidate system applied
to each user population of interest. The second tool can be implemented
in two different ways: (1) direct quantitative measurement and
correlation of the performance and requirements, with quantitative
weighting for the relative importance of each requirement, or (2) the
reduction of all the requirements and performance to a common
denominator of time or cost. The first way has some limitations but
could be implemented in the near future if the quantitative data
describing the requirements and performance were available. The second
way seems to be a more accurate approach but needs further development
before it can be used.
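
The first way of implementing the second tool can be pictured as a
weighted score. The sketch below is a minimal illustration under stated
assumptions, not the procedure developed in the study: the scoring rule
(ratio of delivered to required value, capped at 1), the requirement
names, the weights, and the numbers are all hypothetical.

```python
def figure_of_merit(requirements, performance):
    """Weighted average of per-requirement satisfaction scores in [0, 1].

    requirements: name -> (required_value, weight, higher_is_better)
    performance:  name -> value delivered by the candidate system
    """
    total_weight = sum(weight for _, weight, _ in requirements.values())
    score = 0.0
    for name, (required, weight, higher_is_better) in requirements.items():
        delivered = performance[name]
        if higher_is_better:
            # e.g. fraction of relevant material retrieved
            ratio = delivered / required if required > 0 else 1.0
        else:
            # e.g. response time: smaller delivered values are better
            ratio = required / delivered if delivered > 0 else 1.0
        # Assumed rule: exceeding a requirement earns no extra credit.
        score += weight * min(1.0, ratio)
    return score / total_weight


# Hypothetical requirements for one user population.
requirements = {
    "response_time_hours": (24, 3, False),
    "fraction_relevant_retrieved": (0.9, 2, True),
}
# Hypothetical measured performance of one candidate system.
performance = {"response_time_hours": 48, "fraction_relevant_retrieved": 0.9}
print(figure_of_merit(requirements, performance))
```

Under these assumed numbers the system meets the retrieval requirement
fully but takes twice the tolerable time, and the heavier weight on
response time pulls the single figure down accordingly.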

The third tool, two cost analysis procedures and programs, used a
computer and some modelling programs to simulate the operation of specific
storage and retrieval systems using basic data on time, cost, and
equipment capabilities, to arrive at estimates of the total operating
costs of a candidate system over wide ranges in operating parameters
such as file size, accession rate, and volume of search requests. The
cost analysis procedures utilized a general functional model of a storage
and retrieval system developed during this study. Both cost analysis
procedures were successfully applied to three representative systems;
the results suggest that, given the basic descriptive information, the
two programs could be usefully employed right away for the analysis of
specific candidate systems. The computer programs were written in
ALGOL, a universal programming language, so that they can be used by
any other interested group.
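
The general idea of sweeping a cost model over ranges of operating
parameters can be sketched briefly. The example below is written in
Python rather than ALGOL and does not reproduce the study's programs or
functional model; the linear cost components, the serial-scan assumption,
and every unit cost are hypothetical choices made only for illustration.

```python
def annual_operating_cost(file_size, accessions_per_year, searches_per_year,
                          unit_costs):
    """Estimate yearly operating cost from three assumed cost components."""
    storage = file_size * unit_costs["store_item"]
    input_processing = accessions_per_year * unit_costs["index_item"]
    # Assumed serial-scan model: each search pays a fixed handling cost plus
    # a cost proportional to the number of items in the file.
    searching = searches_per_year * (
        unit_costs["handle_search"] + file_size * unit_costs["scan_item"]
    )
    return storage + input_processing + searching


def sweep(file_sizes, search_volumes, accessions_per_year, unit_costs):
    """Tabulate cost for every combination of file size and search volume."""
    return {
        (size, volume): annual_operating_cost(
            size, accessions_per_year, volume, unit_costs)
        for size in file_sizes
        for volume in search_volumes
    }


# Hypothetical unit costs in dollars; not taken from the report.
UNIT_COSTS = {"store_item": 0.5, "index_item": 2.0,
              "handle_search": 1.0, "scan_item": 0.01}

table = sweep([1000, 10000], [10, 100], 100, UNIT_COSTS)
for (size, volume), cost in sorted(table.items()):
    print(f"file size {size:>6}, searches/year {volume:>4}: ${cost:,.2f}")
```

Tabulating the cost over a grid in this way shows where one candidate
system overtakes another as the file or the search load grows, which is
the kind of comparison the cost analysis procedures were meant to support.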

The work to date on this six-month study represents a very
preliminary effort to obtain solutions to an extremely difficult problem.
Continued studies are necessary to achieve more accurate and useful
evaluation procedures and measures of performance. It is felt that the
following problem areas would be good targets for immediate and
long-range research:

(1) Development of methodology for determining user
requirements

(2)  Determination  of  elemental  times  and  costs  of  the 
basic  operations  performed  in  storage  and  retrieval 
systems 

(3)  Development  and  use  of  modelling  for  performance 
evaluation 


(4)  Development  and  use  of  modelling  for  analysis  of 
operating  costs 

(5) Pilot tests or pilot evaluations of representative
systems

(6) Additional basic studies.

III A METHODOLOGY FOR MEASURING USERS' INFORMATION REQUIREMENTS
A. General Methods

A number of different approaches can be taken to determine the
information requirements of the user of a retrieval system. Generally,
the approaches might be characterized as follows: (1) study of the
user's information environment; (2) study of the present information
resources (a special part of the information environment); and (3) study
of the user. Methods appropriate to each of these approaches are
discussed below.

1. Study of the User's Information Environment

This approach examines some of the economic and time pressures
or practical constraints present in the user's environment that limit
the information resources the individual can utilize. These constraints
are not likely to change very significantly no matter how many new and
improved information retrieval systems are provided; consequently, an
understanding of the constraints is of great importance. These
constraints might be explored with questions such as these:

(1) How much do organizations spend now for information
services, and how much do they feel they can afford?

(2) What total volume of literature is currently made
available to the user in his own organization?
This reflects the organization's scope of interest,
and its budget for information services.

(3) What total volume of literature is of frequent
personal interest to the worker? This represents
the parameters of the file which satisfies a good
fraction of the information needs of the individual
worker.

(4) What is the amount of time that a worker can
afford (because of cost or other pressures)
to spend in reviewing or searching the literature?

For example, regardless of the type of information or services available,
an individual or organization still has a limited amount of time or
money to spend for information.

2. Study of the Present Information Resources

The quality of service of the user's present information
resources provides a lower bound for the requirements of any proposed
alternative system. That is, any new retrieval system should provide
at least as much service and value as the system it is to replace.
Since the present habits and actions of the user reflect, to an unknown
degree, his needs and requirements, we might consider the following
questions:

(1)  How  are  libraries  and  information  services 
actually  used  (functions,  type  of  material, 
type  of  user,  type  of  questions)? 

(2)  What  are  the  operating  statistics  of  present 
systems  (volume  of  questions,  number  of  users, 
budgets,  staffing,  file  size,  input  rate)? 

3. Study of the User

Unfortunately, information about the user is extremely difficult
to obtain. Measurements are difficult, if not impossible, and
most studies resort to judgements or opinions. The user himself is
frequently a poor source for direct comment on his needs; he is usually
influenced by the tools and facilities that he is familiar with, and he
usually cannot discriminate between his actual needs and his way of
performing work. Any of the following methods, or combinations of them,
might be used to obtain information about the user's requirements:

(1) Ask the users specific questions about what
they think their requirements are (e.g.,
tolerable delay, form of resulting product,
types of service preferred).


(2) Analyze recent information requests. Probe
the circumstances that motivated the request
for information. Determine the parameters
(such as response and error rates) that would
have been tolerable in a particular situation.
Find out the nature of any disappointments or
unsatisfactory results. Taking advantage of
the user's hindsight, find out what he would
like to have obtained in the way of specific
products or services.

(3) Monitor the establishment and fulfillment of a
research project or experiment, and note the
specific needs and requirements as they occur.
Although realistic data may be obtained in this
way, the method has the disadvantages of interfering
with the working group, requiring a relatively long
lag time for completion of the data gathering through
a complete project schedule, and probably requiring a
relatively large amount of observer's time for a
number of different projects in order to obtain
statistically significant data.

(4) Postulate a "perfect" retrieval system; then
allow people to pose questions to the system.

(5) Determine the functions (e.g., preparation to
learn new techniques, to learn experimental
results, to plan new research, to prepare lectures,
to keep abreast, etc.) of the various
portions of the information services and find
out how well each of these functions is being
met. The dual of this method is to examine
the various portions or channels of the information
system (e.g., abstracts, books, journals,
advertisements, etc.) and find out the functions
that each of these channels serves.

(6) Measure the result that a user usually obtains
(by performing his regular type of search) and
compare it to the result that can be achieved
by an exhaustive search of all available resources.
This would give some indication of
the amount of overlooked material he could
tolerate.

(7) Perform a controlled experiment in which identical
or comparable tasks are performed by groups with
different information resources. This would give
some indication of the relationship between user
productivity and the availability of information.

(8) Record, in some uniform measure, the amount of
information that is normally available to the
individual in his own office. This would give
an estimate of the scope of interest or range
of coverage of the individual user, and would
show how large a file of information he considers
sufficiently important to warrant the expenditure
of his own time and money.

(9)  Determine  the  circumstances  surrounding  the 
critical  requirements  for  information.  (That 
is,  those  requests  for  information  that  are 
critical  or  fundamental  to  the  solution  of  a 
given  technical  problem.) 

This project asked the user specific questions [see (1) and
(9)] with the aid of the survey techniques described below.
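
Method (6) above amounts to comparing two sets of references. As a
minimal sketch, assume each reference can be given an identifier and that
the exhaustive search is treated as the complete set of relevant material;
both are assumptions made for illustration, and the identifiers below are
invented.

```python
def overlooked_fraction(usual_search, exhaustive_search):
    """Fraction of the exhaustive-search results missed by the usual search."""
    usual = set(usual_search)
    exhaustive = set(exhaustive_search)
    if not exhaustive:
        return 0.0  # nothing to find, so nothing was overlooked
    missed = exhaustive - usual
    return len(missed) / len(exhaustive)


# Hypothetical example: the user's regular search found two of the four
# references that an exhaustive search turned up.
print(overlooked_fraction(["ref-a", "ref-b"],
                          ["ref-a", "ref-b", "ref-c", "ref-d"]))
```

Repeating this for a number of searches by the same user would indicate
how much overlooked material he habitually accepts, though not whether
that amount reflects need or habit.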

B. Description of the Survey Technique

A survey technique, using personal interviews among a specific user
population, was selected for determining user requirements in this study.
A preliminary interview guide, incorporating the so-called
critical-incident approach as well as direct questions, was developed
after some intensive interviews and after discussions among members of
the project team. The preliminary guide was pre-tested among nine
electrical engineers on the Institute staff.

The final interview guide* was designed to obtain four kinds of
information:

(1) A list of critical requirements, using the critical
incident technique†

(2) Measurements of selected requirements that were considered
both important and susceptible to measurement
(Some requirements known to be important were unavoidably
omitted because of the preliminary nature
of this project.)

(3) Rank order of the importance of seven factors that
were believed to be important to users and were
amenable to ranking

(4) Background variables that might influence the user
needs (company, age, academic degree, specialty field,
type of search, and the like).

* The interview guide was simply a guide and recording form for the
interviewer. It was not a questionnaire, and it was not meant to be
read or closely examined by the test subjects.

† The critical incident technique is a method for identifying
requirements that are of particular importance to the success of a task
(in this instance, a literature search). This is described more fully in
Section III. For a more detailed description of the critical incident
technique, see Ref. 1. (All references are listed at the end of
the report.)

The focus of the interview was on the most recent search conducted by
the individual. Two of the 94 individuals contacted had not conducted
a search in the past year and were not interviewed.

The approach of limiting the interview to the most recent search
(and consequently reflecting the performance of the present system
available to the individual) was considered at length by the project
team. There were three major arguments for this approach:

(1)  That  respondents  could  talk  realistically  about  the 
present  system 

(2)  That  their  needs  remain  constant  regardless  of  the 
system  available 

(3)  That  any  contemplated  new  system  would  have  to  be 
equal  or  superior  to  the  present  system. 

There  were  two  major  arguments  against  the  approach: 

(1)  That  a  new  system  might  offer  such  vast  improvements 
that  answers  concerning  the  present  system  would  not 
be  meaningful 

(2) That the users' statements of needs are definitely
conditioned by the service they are presently
accustomed to.

The possibility of asking respondents to answer in terms of an
"ideal" system was considered. This was rejected because it was
believed that answers might be given that are unrealistic in terms of
present capabilities (e.g., "I want 100% of the world's relevant material
and no irrelevant material within one hour of my request.").

Giving  the  respondents  a  choice  of  various  system  capabilities  was 
also  considered.  For  example,  respondents  could  have  been  asked  to 
choose  between  many  pairs  of  systems,  such  as  the  following: 

(1) A system that in 24 hours produced documents of which
50 percent were irrelevant, versus a system that in
one week provided only the relevant documents;

(2)  A  system  that  produced  all  the  relevant  documents 
but  many  irrelevant  documents,  versus  a  system  that 
produced  few  irrelevant  documents  but  might  miss  a 
few  relevant  documents. 

This technique was rejected because the number of variables, and
consequently the number of alternatives that would have to be presented
to the respondent, was too great. It was difficult to imagine that many
respondents would be willing to take the time and effort to make all the
choices from the pairs of alternatives.
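
The size of the problem is easy to estimate. As a rough sketch, assume
that each of `n_variables` system characteristics is offered at `levels`
distinct values; both numbers are hypothetical, since the report does not
state how many variables or values were contemplated. Every combination is
then a distinct hypothetical system, and a complete paired comparison asks
about every pair of systems.

```python
from math import comb


def paired_choices(n_variables, levels):
    """Number of system pairs a respondent would face in a full comparison.

    Assumes every combination of variable levels is a distinct candidate
    system and that every pair of candidates is presented once.
    """
    systems = levels ** n_variables
    return comb(systems, 2)


# Hypothetical sizes, chosen only to show how quickly the count grows.
for n_variables, levels in [(2, 2), (3, 2), (4, 3)]:
    print(n_variables, levels, paired_choices(n_variables, levels))
```

Even three variables at two values each already give 28 pairs, which
supports the decision to drop the technique for a 45-minute interview.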

The interview in its final form took about 45 minutes per individual.
General interest in the subject was high, and the cooperation of
respondents was excellent.

C. Description of the Sample Population

Test subjects were chosen from persons doing applied research in
the field of electronics. Eleven metallurgists were added later. For
the main purposes of the project, the choice of population was not
critical; this is because our prime interest lies in developing the
methods of measurement, and in determining which requirements can be
described analytically, and which requirements must receive a judgmental
description. In order that results can be validly compared with the
results of other surveys, it is important to describe the population
accurately; details of the measurement of this particular population may
be useful for other purposes also.

The exploratory nature and scope of the study did not permit a precise
sample of a known population. Stanford Research Institute and three
California industrial firms* each provided approximately equal numbers
of test subjects.

* IBM Laboratories, San Jose; Lockheed Missiles and Space Co., Palo Alto;
Sylvania Laboratories, Mountain View.

A sample of persons engaged in many fields of applied electronics
research was selected in each firm, with a total of 92 persons
receiving personal interviews that generally lasted about 45 minutes.
The great majority of subjects held academic degrees in electrical
engineering. A few held a degree in another field (primarily physics).
An attempt was made to obtain a greater number of workers with higher
academic degrees and in higher job levels than would be obtained with a
random sample, so that the results could be examined according to these
variables. Detailed tables of the characteristics of the sample
population may be found in Appendix F.

In  addition  to  interviews  with  electronics  researchers,  interviews 
were  conducted  with  11  metallurgists.  One  was  interviewed  at  Sylvania, 
and  a  sample  of  ten  were  interviewed  at  Lockheed. 

The analysis and summary of the interview responses of the
electrical engineers are given in Sec. IV.

D.  Initial  List  of  Requirements 

In order to define and describe the information requirements that
were to be selected for measurement, the project team initially developed
a list of many parameters that were felt to be important. A large amount
of published material was reviewed to uncover additional parameters, and
discussions were held with a number of informed individuals outside SRI.
The resulting list of requirements was rather large, and was subsequently
reduced to a more manageable group of about 40 requirements which seemed
to fall naturally into five different categories. These are described
below.

1. General Requirements for All Alternative Systems

General requirements are those that are common to all candidate
systems and can be satisfied in the same way and with the same costs and
results for each alternative system. Consequently, they do not
contribute to a comparison of the differences between the candidates, and
should be separated from the rest of the requirements. For example,
there is a requirement that each file be as complete as possible in the
subject fields of interest to the users; for the user that is choosing
between alternative ways to implement his file, this is an acquisition
problem common to all the alternatives under consideration. These
general requirements must be considered in the over-all evaluation of a
system, but are not considered in the detailed analysis and comparison
of specific systems. The following are examples of such general
requirements:

(1) Acquisition of high-value, timely, technically
excellent file material

(2) Provision for translations of foreign language
material

(3) Provision for throw-away copies of requested
file items.

2. Search Product Requirements

The following requirements are concerned with the actual search
product given to the requestor:

(1) Specified format of search product (document
number, reference or citation, abstract, reprint)

(2) Specified physical form of search product
(microfilm, paper, etc.)

(3) Specified quality of printing

(4) Reliable indexing and search products (i.e.,
assurance that you always get what you ask for).

3. File Material Requirements

The following requirements are concerned with the material in
the file:

(1) Need for a certain type of information to be
included in the file (technical papers, books,
patents, reviews, etc.)

(2) Capability for accepting information written in
the important foreign languages

(3) Capability for storing graphic material (equations,
diagrams, chemical structures, etc.)

(4) Capability for storing a certain volume or quantity
of information

(5) Compatibility with other information and
communication systems

(6) Protection against loss of stored information
(e.g., protection of information on magnetic
tape).

+ 

4 .  User  Requirements 

The  following  considerations  relate  to  the  actual  "over-the- 
counter"  services  given  to  the  user  by  the  information  services  staff 
and  are  of  direct  interest  to  the  user  of  the  information  services: 

(1)  Amount  of  relevant  material  overlooked  during 
the  search 

(2)  Amount  of  irrelevant  material  provided 

(3)  Delay  in  getting  the  first,  final  and  major 
group  of  relevant  references 

(4)  Ease  of  communication  between  the  system  and 
user  (codes,  languages,  media) 

(5)  Complexity  of  search  logic  that  can  be 
accommodated 

(6)  Completeness  of  coverage  (core  and  fringe 
material,  recent  and  past  literature) 

(7)  Provision  for  alternative  mode  of  operation 
(e.g.,  manual)  if  one  or  more  of  the  system 
parts  become  inoperative 

(8) Indications of the technical competence of
each search product

(9) Immediate and continuous availability for
searching or file browsing directly by the
user, with a minimum of effort on his part

(10) Ability to control and handle language problems
with minimum inconvenience to user
(synonyms, jargon).

A distinction is made in this section between the "users" who come to
the system seeking service, and the "operators" who operate and maintain
the system. In many cases the "operators" are the only ones who
actually use the system, in the sense that they operate the equipment
and search the files.

5. System Management Requirements

The  following  requirements  are  concerned  primarily  with  the 
behind-the-scenes  operation  of  the  information  service,  and  are  of  most 
interest to the organization that is providing and operating the service:

(1)  Provision  for  easy  re-indexing,  purging,  file 
maintenance;  and  the  capability  to  provide  a 
duplicate  of  the  classification  and  indexing 
information 

(2)  Minimum  need  for  space,  power,  and  special 
installation  or  operating  facilities 

(3)  Minimum  need  for  training,  retraining,  or 
specialization  of  system  personnel 

(4)  Growth  capability  (file  size,  subject  diversity, 
volume  of  searches,  etc.) 

(5)  Self-analysis  to  recover  misfiled  information, 
note missing information, obtain operating statistics
on system use and performance, generate
indexes  or  catalogs,  and  provide  information 
for  management  and  system  control 

(6) Costs (equipment purchase or rental, maintenance,
spare parts, parallel testing, conversion, initial
development and programming, indexing, reproduction,
storage, training, staff, etc.)

(7) Ability to coordinate the system with similar
services in the same or alien subject fields

(8)  Ability  to  conduct  a  specified  number  of 
searches  within  a  given  time  period. 

The  type  of  user  interviewed  in  this  study  is  generally  not  qualified 
to  comment  on  these  behind-the-scenes  requirements.  Library  managers 
would be better qualified; however, none were contacted on this project
because our attention was concentrated on the study of the requirements
of the ultimate customer of such an information service.

Because  of  practical  restrictions  on  time,  money,  and  the 
patience of the test subjects, measurement of every one of these requirements
could not be attempted. Consequently, those requirements
that were felt to be most important, and had some promise of being
measurable, were selected for detailed study. It was felt initially
that the following factors were most important:

(1)  Type  and  form  of  search  product  (document 
number,  reference  or  citation,  abstract, 
reprint;  on  paper,  on  film,  etc.) 

(2)  Reliability  of  the  indexing  and  search 
product (i.e., credibility of the product
and  the  knowledge  that  one  always  gets  an 
accurate  search  product) 

(3)  File  capacity 

(4)  Delay  in  entering  new  information  into  the 
system 

(5)  Automatic  removal  of  obsolete  or  redundant 
material 

(6)  Amount  of  relevant  material  overlooked  during 
the  search 
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(7)  Amount  of  irrelevant  or  redundant  material 
provided  with  the  search  result 

(8) Immediate and continuous system availability
for  searching  or  file  browsing  directly  by 
the  user 

(9)  Delay  in  getting  the  first,  final,  and  major 
group  of  relevant  references 

(10)  Total  number  of  searches  that  can  be  handled 
in  a  given  time  period 

(11)  Ease  of  communication  between  system  and  user 
(codes,  languages,  media) 

(12) Provision for alternative mode of operation
(e.g., manual) if one or more of the system
parts becomes inoperative.

The  following  three  items  are  important,  but  the  user  is 
generally not qualified to comment on them:

(1)  Cost 

(2)  Capability  for  easy  re-indexing,  purging, 
correction,  and  file  maintenance 

(3) Capability for self-analysis to recover
misfiled information, note missing information,
obtain system operating and performance
figures, and generate indexes or
catalogs.

E. Requirements That Can Be Measured

The  measurements  that  were  made  are  crude,  and  often  consist  of  only 
a  few  data  points.  However,  the  measurement  techniques  can  be  refined 
to  obtain  greater  accuracy  and  more  resolution.  At  this  point,  it  seems 
certain that for a given user population the following group of requirements
can be quantitatively measured, and that we can have at least some
confidence  in  the  results  that  are  obtained: 
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(1)  Desired,  actual,  and  least  tolerable  delay  in 
obtaining  the  first,  final,  and  major  group  of 
search  products 

(2)  Desired,  actual,  and  least  tolerable  currency 
or  minimum  age  of  the  file  contents 

(3)  Desired,  actual,  and  least  tolerable  format  of 
search  product  (abstract,  citation,  etc.) 

(4)  Desired,  actual,  and  least  tolerable  physical 
form  of  search  product  (paper,  microfilm,  etc.) 

(5)  Desired,  actual,  and  least  tolerable  amount  of 
irrelevant  material  furnished 

(6) Size of the file required to satisfy various
search  needs 

(7)  Tolerable  expenditures  of  effort  to  obtain 
more current information

(8) Tolerable delay for various fractions of the
total  amount  of  relevant  information. 

It also seems certain that the relative rankings of a given set of requirements
can be determined without too much difficulty. Methods for
determining  the  rankings  and  ascertaining  their  confidence  levels  are 
described  in  a  subsequent  section. 

There were some relatively important requirements for which measurements
were not made:

(1)  Tolerable  fraction  of  relevant  material  that  can 
be  overlooked 

(2)  Tolerable  amount  of  effort  required  by  the  user  to 
communicate  with  the  system. 

For  a  number  of  reasons,  both  of  these  requirements  are  extremely 
difficult  to  measure,  and  no  method  was  found  that  could  be  applied  on 
this short study. Several aspects of the question of overlooked relevant
material have been studied recently by a number of people, but their
efforts have been concentrated primarily on instrumentation or methodology
(Refs. 2-6), and they have not obtained specific measurements.

In addition to obtaining some specific measurements of the requirements,
some background material was also obtained (see Sec. IV and
Appendix F) to describe the circumstances surrounding the requirements,
such as: What types of work activities generate the search requests?
Who actually conducts the searches? What search facilities were used?

F. Suggestions for Improvement of Survey Methodology

In  view  of  the  exploratory  nature  of  this  study,  it  is  obvious  that 
some  improvements  in  the  interview  guide  can  be  suggested.  The  following 
suggestions refer only to changes in the interview guide (see Appendix F);
suggestions for additional research are covered in Sec. VII.

(1) There was some confusion about the term "search," in
spite of the definition given respondents. A search
may consist of two separate operations: looking for
references, and obtaining the documents. Consideration
might be given to conducting the interviews
separately for each of these two processes, particularly
where existing manual systems tend to divide
the two into separate tasks.

(2)  The  critical-incident  technique  could  perhaps  be 
refined  to  elicit  better  responses  and  ones  that 
were  more  system-oriented.  A  number  of  comments 
referred  to  requirements  that  no  system  could  be 
expected to meet (e.g., "not enough written,"
"subject too current").

(3) Some of the questions and answer categories could
be refined. In particular, if a larger population
is studied, the time categories could be increased
in number so that a smaller period of time is covered
by each category.

(4) The procedure and wording for rank ordering of
selected requirements should be reviewed. First,
the wording of the instructions could perhaps be
shortened and made clearer. If possible, the degree
to which the requirements are in conflict should
be explained. Second, the wording of the requirements
could be improved. Third, some additional
requirements could be included.

(5) The items concerning time or effort spent vs.
completeness  of  the  search  are  now  of  questionable 
value  and  can  probably  be  dropped.  These  items 
were  admittedly  experimental.  While  respondents 
answered  as  best  they  could,  it  is  doubtful  that 
they  can  realistically  provide  precise  data. 
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IV  SURVEY  RESULTS 


The results of the survey are discussed in detail in Appendix F.
The purpose here is to give an over-all view of the needs of the individuals
interviewed for this study. For this purpose, the survey results
will be reviewed briefly. All data refer to the sample of electrical
engineers, except for a short section at the end dealing with metallurgists.

A. Frequency and Types of Searches

As  stated  earlier,  92  of  the  94  electrical  engineers  contacted  had 
conducted  or  requested  at  least  one  search  in  the  last  year.  The  number 
of  searches  per  individual  varied  widely.  Responses  were  about  equally 
distributed  among  the  following  categories:  1  or  2  searches  in  the  past 
year,  3  to  5,  6  to  10,  and  11  or  more  (see  Question  2  of  the  Interview 
Guide in Appendix F).

The  work  activities  that  generate  the  most  searches  are  not 
necessarily those in which the most working time is spent. "Search
for novel technical ideas," "preparation of lectures or technical
papers," and "keeping current with technical advances" were mentioned
by 0 percent, 2 percent, and 1 percent, respectively, as the one activity
in which the most working time was spent. These same activities, however,
accounted for 20 percent, 12 percent, and 11 percent, respectively,
of the most recent searches reported by respondents. An exception was
design of equipment, systems, and procedures. Almost half the respondents
indicated that this was the one activity in which they spend the
most working time, and 30 percent said their most recent search concerned
this activity (Questions 3a and 3b, Appendix F).

Greater  importance  was  attributed  to  the  search  when  it  was  initiated 
than  to  the  results  of  the  search.  Of  the  respondents,  70  percent  rated 
the  search  important  when  it  was  started  but  54  percent  said  that  the 
results had made little difference to them when the search was completed.
These responses may have occurred because the answer categories to the
two relevant questions were not identical (Questions 9 and 10, Appendix F).

B. Critical Requirements


Some  exploratory  questions  were  asked  using  a  technique  modeled  after 
the  critical-incident  technique,  mentioned  in  Sec.  III-B.  The  purpose 
of  these  questions  was  twofold.  First,  they  were  intended  to  determine 
whether or not there were a few "critical" requirements, that is, a few
outstandingly  important  criteria.  The  second  purpose  was  to  provide 
some  indication  as  to  whether  the  list  of  requirements  selected  for 
measurement  in  the  study  excluded  some  important  ones. 

Respondents  were  asked  to  state  the  most  difficult  or  irritating 
thing  that  occurred  during  their  last  search  and  to  name  the  easiest 
or most gratifying thing that happened. The results of these two questions
are shown in Table I. They were also asked what advice they would
give  a  new  young  engineer  embarking  on  the  same  type  of  search  to  make 
the  search  easier  and  what  pitfalls  they  would  point  out  to  him.  Table 
II  contains  the  tabulation  of  responses  to  these  questions. 

The responses, perhaps due to the wording of the questions, were
extremely varied. The interviews showed that instead of there being
several requirements that are of extreme importance, there is actually
a  wide  array,  all  of  which  are  of  some  importance  to  the  performance 
of  the  system.  The  list  of  requirements  subjected  to  measurement  during 
this  study  did  not  appear  to  exclude  any  of  great  importance. 

The  most  frequently  mentioned  factors  concerning  the  subject's  last 
search  referred  to  relevant  material  produced.  There  were  a  number  of 
general comments (28 percent) on the ease with which relevant references
were  found  and  documents  obtained.  There  were  also  a  number  of  comments 
(26  percent)  concerning  the  ease  with  which  the  actual  document  is  found 
after  a  reference  to  it  is  located. 


In  this  Section,  "positive"  comments  mean  those  comments  that  are 
complimentary  to  the  present  system.  "Negative"  comments  are  those 
that  are  uncomplimentary  or  derogatory  to  the  present  system. 
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Table  I 


CRITICAL REQUIREMENTS LISTED BY ELECTRICAL ENGINEERS
IN RELATION TO THEIR MOST RECENT SEARCH

Percent of Engineers
Making Comment

SEARCHER 

Subject  was  in  own  field  0 

Had  his  own  source  7 

Gained  Information  personally  useful  -T 

Knew  someone  or  met  someone  who  knew  sources  4 


SYSTEM — Relevant  Material  Produced 

Finding references and documents, finding them
easily (or finding nothing if that is aim)  28

Ease of getting document after reference to it
found  26

Good  bibliographies,  abstracts,  indexes  produced  17 

SYSTEM — Operation 

Adequate  indexing,  ease  of  understanding  indexing  15 

Ease  of  communication  with  system  11 

Adequate  cross  referencing  11 

SYSTEM — Irrelevant  Material  Produced 

Need  less  irrelevant  material  12 

SYSTEM— Time 

Receive  material  in  short  time  9 


SYSTEM— File  Size 

Need  for  foreign  literature,  translations 

SYSTEM — Relevant  Material  Missed 

When you know information exists, want to be able
to find it; want to be sure you have all the
good sources

SYSTEM — Provision  of  Copies  of  Documents 


To  get  copies  of  material  easily  4 

PROBLEMS  OUTSIDE  CONTROL  OF  SYSTEM 

Material classified, difficult to obtain  8

Subject  too  new,  no  material  available  5 

Not  much  written  on  subject  3 

Material  unpublished,  available  only  from  individuals  3 

Base  (92) 


Note: The above data were obtained by combining responses to the two
following questions: Question 5a: "Do you recall anything happening
during the search that made it an easier or better search, or
that made the search difficult? For example, what was the most
difficult or irritating thing that happened?" Question 5b: "What
was the easiest or most gratifying thing that happened?" "Other"
and "no answer" responses have not been included. Duplicate responses
(one individual giving same answer to both questions) were
eliminated.
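
The tabulation rule stated in the note (combine the answers to the two questions, count each respondent at most once per category, and express the count as a percent of the base) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the category labels, and the four sample respondents are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_making_comment(responses, base):
    """Percent of respondents mentioning each category in either question.

    responses: one (answers_to_first_question, answers_to_second_question)
               pair per respondent, each a list of category labels.
    base:      number of respondents used as the denominator.
    """
    counts = Counter()
    for answers_first, answers_second in responses:
        # The set union counts a category once per respondent, even when
        # the same answer was given to both questions.
        for category in set(answers_first) | set(answers_second):
            counts[category] += 1
    return {category: round(100 * n / base) for category, n in counts.items()}


# Hypothetical responses from four respondents (not data from the study).
sample = [
    (["indexing"], ["indexing"]),            # duplicate answer, counted once
    (["irrelevant material"], ["short time"]),
    ([], ["indexing"]),
    ([], []),                                # no usable answer
]
result = percent_making_comment(sample, base=4)
print(result)
```

Because a respondent may mention several categories, the percentages in such a table need not add to 100.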
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Table  II 


SUGGESTIONS  THAT  ELECTRICAL  ENGINEERS  WOULD  MAKE 
TO  A  YOUNG  ENGINEER  STARTING  A  SEARCH 


SEARCHER 

Talk  to  men  who  are  in  the  field 
Be  informed  on  your  subject 

Define  the  problem  clearly,  specify  scope  before 
starting 

Go  to  library  yourself,  be  aware  of  library 
facilities 

SYSTEM—File  Size 

Use  abstracts,  indexes 
Try  ASTIA 

Use  journals  in  the  field 

Note  references  and  bibliographies  given  in 
technical  articles 

Look  at  bibliographies  that  are  available 
Try  textbooks 

SYSTEM — Irrelevant  Material 

Scan rapidly, discard irrelevant material quickly

SYSTEM — Descriptors 
Use  enough  key  words 

Use  computer,  descriptors  for  computer 

SYSTEM— Time 
Be  patient 

SYSTEM — Evaluation  of  Material 

Don't  believe  everything  you  read,  select  reliable 
sources 

SYSTEM — Relevant  Material  Missed 

Make  sure  you  look  at  all  sources  of  information 

SYSTEM — Time Period Covered by Documents

Obtain  current  information — weed  out  the  old 

OPERATOR  OF  SYSTEM 
Ask  the  librarian 
Don't  ask  the  librarian 

Base 


Percent of Engineers
Making Comment

23 

10 


17 

13 

10 

12 

0 

5 

4 

4 

11 


16 

3 

(92) 


Note: The above data were obtained by combining responses to the two
following questions: Question 5c: "If a young engineer who had
just joined the staff were starting this same search today, what
advice would you give him to make the search easier?" Question
5d: "What would you warn him about?" "Other" and "no answer" responses
have not been included. Duplicate responses (one individual
giving same answer to both questions) were eliminated.
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References  to  good  bibliographies,  abstracts,  or  indexes  produced 
by the search were made by 17 percent, almost all in positive terms.
Also mentioned by a number of respondents were the indexing system (15
percent) and cross referencing (11 percent). All of these responses
were negative.

Of  the  11  percent  that  referred  to  ease  of  communication  with  the 
system, some found it satisfactory and others did not. Of the 12 percent
of the respondents who mentioned irrelevant material, all mentioned
it  unfavorably. 

There  were  also  some  responses  concerning  the  last  search  that  are 
not  directly  related  to  a  system.  For  example,  there  were  a  number  of 
references to the knowledge and sources the individual brings to the
search before starting:

"The search was a little bit out of my field, which made it
harder."

"I've subscribed to IRE since 1949 so had my own source."

"I was fortunate enough to meet a man at a Berkeley meeting
who knew just where to look."

This  type  of  response  was  even  more  frequent  in  offering  advice  to  a  young 
engineer starting a search. The following comments are typical:

"Have as much information as you can on the subject before
you start."

"Talk  to  people  who  are  familiar  with  this  area  of  investigation." 

While  no  system  could  perform  the  functions  implied  by  such  comments,  it 
is possible that a system more adequately meeting other direct requirements
(e.g., producing all relevant documents on the subject) would reduce
the amount of time and effort required of the individual searcher
in preparing for the search.

C. Measurement of Selected Requirements


The  purpose  of  the  series  of  detailed  questions  on  the  most  recent 
search  was  to  obtain  data  on  requirements  that  could  be  measured,  and 
to  obtain  opinions  on  those  that  could  not.  In  the  case  of  file  size, 
minimal  information  was  obtained  because  of  the  concentrated  effort 
other  studies  have  made  on  this  one  requirement.  Four  measurements  were 
obtained  where  possible:  actual  performance  of  the  present  system, 
desired  performance,  minimum  performance  that  is  acceptable,  and  rank 
order  of  importance  in  system  performance. 

Concerning time required to obtain the major group of relevant references,
the actual and the needed performance were quite similar. The
importance of promptness in providing documents is quite evident. Over
one-fourth of the subjects received the references in one day or less,
and almost half in three days or less. The minimum acceptable performance
level was considerably lower: 65 percent could have waited two
weeks or more (Questions 11a, 11b, and 11c in Appendix F).

The need for current material was also expressed. About one-third
received some documents that were under 3 months old, and a slightly
higher proportion (37 percent) said they needed such current material.
Minimum performance would have permitted older material; over half said
they would have been satisfied with documents that were all over 2 years
old (Questions 12a, 12b, and 12c in Appendix F).

The  actual  form  in  which  documents  came  to  the  users,  and  their 
preferences  for  form,  did  not  coincide  closely.  The  great  majority  (81 
percent) received at least some complete documents. Citations were received
by 45 percent, abstracts by 42 percent, and document numbers by
only 2 percent. However, 68 percent said abstracts are a preferred form
and 64 percent said complete documents are a preferred form (more than
one preference could be given). Almost all (97 percent) said that document
numbers are an inadequate search product; over half (54 percent)
said citations are an inadequate search product (Questions 13a, 13b,
13c, and 13d in Appendix F).


Apparently  irrelevant  material  is  not  considered  to  be  a  great 
problem  among  respondents.  Concerning  the  amount  of  time  respondents 
personally spent on the search, 41 percent said that less than one-fourth
of their total time was spent culling out duplicate and irrelevant material.
Forty-four percent indicated that less than one-fourth of their
effort should be spent in this way. If necessary, respondents would
have been willing to spend much more time eliminating irrelevant documents;
45 percent said they would have spent a maximum of three-fourths
or more of their time getting rid of unnecessary material (Questions 14a,
14b, and 14c in Appendix F).

General  questions  were  asked  to  determine  who  conducted  the  search, 
where  it  was  conducted,  and  how  the  search  request  was  specified.  The 
great  majority  of  respondents  (00  percent)  participated  personally  in 
the  search.  Librarians  participated  in  27  percent  of  the  searches 
(Question 6 in Appendix F). Almost all respondents said the search was
conducted at least partially in their own organization's library. However,
other sources were also used, either directly or through the
organizational library. University libraries were mentioned by 32 percent,
ASTIA by 25 percent, and other sources by 17 percent (Question 8
in Appendix F). There was some variation in the way the search was
specified.  While  almost  half  (46  percent)  said  they  used  specific  terms 
or  key  words,  23  percent  said  they  described  the  problem  generally,  13 
percent  said  they  used  several  broad  headings,  and  15  percent  said  they 
were "fairly" specific (Table F-1, Appendix F).

Some questions also were asked concerning time and effort vs. completeness
of the search. As indicated in Sec. VII, these questions were
experimental. The data should be regarded as indicative only, since
respondents  probably  cannot  reply  realistically  to  such  questions. 
Respondents  were  asked  how  long  they  could  wait  for  a  search  covering 
50  percent  of  the  potential  sources,  for  one  covering  80  percent,  and 
for  one  covering  all  or  almost  all  potential  sources.  Although  the  trend 
was  definitely  toward  a  longer  wait  for  a  greater  number  of  sources, 
there was little agreement among respondents on the amount of time they
would be willing to wait. Answers were quite varied. The median fell
in the 8- to 13-day category for a search covering 50 percent of the
sources, in the 2- to 3-week category for 80 percent of the sources,
and in the 4- to 7-week category for all or almost all of the sources
(Table F-3 in Appendix F).
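
Because the answers were collected in ordered time categories rather than as exact figures, the median is reported as a category. A minimal sketch of how a median category can be located from category counts is given below. The category labels and counts are hypothetical (the actual answer categories and tallies are in Appendix F and are not reproduced here), and the rule used, taking the first category whose cumulative count reaches half of the responses, is an assumption about the tabulation.

```python
def median_category(categories, counts):
    """Return the label of the ordered category that contains the median.

    categories: labels in increasing order (e.g., shortest wait first).
    counts:     number of respondents choosing each category.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses to summarize")
    running = 0
    for label, n in zip(categories, counts):
        running += n
        # First category whose cumulative count reaches half the responses.
        if 2 * running >= total:
            return label


# Hypothetical categories and counts for 92 respondents.
labels = ["1 day or less", "2 to 3 days", "4 to 7 days", "8 to 13 days",
          "2 to 3 weeks", "4 to 7 weeks", "8 weeks or more"]
counts = [5, 10, 20, 25, 15, 10, 7]
print(median_category(labels, counts))
```

A median found in this way says nothing about how widely the answers are spread, which is consistent with the caution that the answers were quite varied.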

In  the  same  series  of  questions,  respondents  were  asked  how  much 
of  their  own  working  time  they  would  be  willing  to  spend  if  they  could 
be  sure  50  percent,  80  percent,  or  almost  all  relevant  sources  were 
located.  The  median  fell  in  the  2-  to  4-day  category  for  searches 
locating  50  to  80  percent  of  the  relevant  sources,  and  in  the  1-week 
but less than 2-week period for a search locating almost all the relevant
sources (Table F-4 in Appendix F).

Respondents  were  also  told  to  assume  that  a  search  had  covered 
material  up  through  two  years  ago,  which  required  X  amount  of  their 
own  working  time.  They  were  then  asked  how  much  additional  time  they 
would  personally  spend  to  update  the  material  to  within  1  year,  within 
6  months,  and  within  1  month.  The  median  category  to  update  from  2 
years to 1 year was an additional 1/2 X to 1 X. The median to update
from 2 years to 6 months and from 2 years to 1 month was 2 X to 4 X
(Table F-5 in Appendix F).

Two broad questions were asked concerning file size. First there
was a question concerning how often respondents could have used searches
(regardless of existing facilities) covering varying numbers of sources
over the last five years of publication. Respondents were then asked
how their answers would change if they had not been limited to five
years. The great majority (82 percent) often could have used a search
covering 15 or fewer journals over the last five years of publication.
More  extensive  coverage,  in  terms  of  numbers  of  sources,  could  have  been 
used  by  the  majority  occasionally.  However,  even  though  they  were 
offered  the  capability,  the  users  seldom  wanted  to  search  the  entire 
world's  literature  to  answer  their  question.  Very  few  respondents  said 
they  would  have  more  occasion  to  search  the  files  listed  if  they  were 
not limited to the last five years (Questions 19a and 19b in Appendix F).

Although the sample was too small to permit extensive cross tabulations,
some of the data were tabulated according to organizational
affiliation.  The  number  from  each  organization  is  quite  small,  but 
some  of  the  differences  are  worth  noting.  For  example,  one  participating 
company  has  facilities  for  computer  searching.  In  that  company,  fewer 
respondents  personally  conducted  their  own  search  than  those  in  other 
organizations.  The  length  of  time  these  respondents  had  to  wait  to 
receive  references — and  the  length  of  time  they  said  they  should  have 
to  wait — was  less  than  that  reported  by  other  respondents.  The  majority 
of  these  respondents  received  some  references  in  the  form  of  citations, 
and  considered  complete  documents  adequate  but  preferred  abstracts. 
Respondents  from  the  same  company  spent  a  greater  proportion  of  the 
total  time  spent  on  the  search  culling  out  irrelevant  material  and 
indicated that a larger proportion of time was the tolerable level for
this  activity  than  did  respondents  from  other  companies.  These  and 
other  differences,  while  not  conclusive,  are  evidence  that  the  facilities 
available  to  the  individual  have  an  effect  on  his  searching  habits.  It 
appears  that  the  individual  states  his  needs  in  terms  that  are  realistic 
within  the  capabilities  of  the  system  that  is  available  to  him. 

As  stated  earlier,  11  metallurgists  were  also  interviewed.  The 
purpose of these interviews was to determine whether or not the interview
guide could be applied to persons in other fields. Certain minor
and obvious changes would have to be made for subsequent surveys in
fields outside of electronics, such as rewording the references to searches
in the field of electronics. Interviews with the metallurgists produced minor
variations in responses, but in general the guide worked as well as it
had with electrical engineers. One difference in response, as would be
expected,  was  the  number  of  references  to  special  information  facilities 
already  available  within  the  field  of  metallurgy. 

D. Analysis of Respondent Rankings

In  Question  15  of  the  questionnaire  the  respondent  is  asked  to 
rank  (arrange)  seven  document  retrieval  system  characteristics  by  order 
of importance, assigning 1 to most important, 2 to the second most
important, and on down to 7, the number assigned to the least important
characteristic. If two or more characteristics are considered to be
equally important, for instance, if the respondent ties the third and
fourth ranked characteristics, then each is allotted the average of the
ranks, in this case the rank 3-1/2.
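
The treatment of ties described above can be sketched in a few lines. The function name and the sample respondent are hypothetical; the only rule taken from the text is that tied characteristics share the average of the ranks they would otherwise occupy.

```python
def average_tied_ranks(positions):
    """Convert one respondent's ordering into ranks, averaging any ties.

    positions: dict mapping a characteristic label to the position the
               respondent gave it; equal positions indicate a tie.
    Returns a dict mapping each label to its (possibly fractional) rank.
    """
    ordered = sorted(positions, key=positions.get)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j to the end of the group tied with the item at index i.
        j = i
        while j + 1 < len(ordered) and positions[ordered[j + 1]] == positions[ordered[i]]:
            j += 1
        # The tied group occupies ranks i+1 .. j+1; each gets their mean.
        shared = ((i + 1) + (j + 1)) / 2
        for label in ordered[i:j + 1]:
            ranks[label] = shared
        i = j + 1
    return ranks


# Hypothetical respondent who ties the third and fourth characteristics.
example = average_tied_ranks(
    {"A": 1, "B": 2, "C": 3, "D": 3, "E": 5, "F": 6, "G": 7}
)
print(example["C"], example["D"])
```

Averaging keeps the sum of the seven ranks at 28 for every respondent, with or without ties, so the rankings of different respondents remain comparable.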

The characteristics, labeled A-G, are:

(A)  Minimum  time  to  get  the  major  group  of  relevant 
references to you

(B)  Minimum  of  irrelevant  material  produced  by  the 
search 

(C)  Minimum  of  relevant  material  overlooked  by  the 
search 

(D) References come to you in form you prefer (complete
document, abstract, citation, or document
number)

(E)  Assurance  that  documents  on  a  given  subject  do 
not  exist 

(F)  Minimum  of  effort  on  your  part  to  communicate 
your  request  for  a  search 

(G)  Certainty  that  specified  sources  over  certain 
period  of  time  were  searched  (certainty  that  100 
percent  of  the  sources  were  searched,  certainty 
that  90  percent  were  searched  but  10  percent  may 
not  have  been  searched,  etc.). 

1.  Rank Correlation

Respondents were asked to rank rather than measure the importance
of data retrieval system (DRS) characteristics because of the difficulty
of constructing an objective scale for such measurements. Even if
importance were a measurable quality, it would not be sufficient to know
that a respondent thought characteristic A to be 20 percent more
important than characteristic B without also knowing the equivalent of
100 percent on some objective scale.

There are two questions concerning rankings that can be answered
by the methods developed (see Appendix A) in the theory of rank
correlation (Ref. 7):

(1)  What is the agreement, or concordance,
     among the individual rankings, and

(2)  What is the "true" ranking of the
     performance characteristics?

It should be noted that a ranking does not tell how close the
characteristics are on some scale. However, a ranking is unaltered if the
scale is stretched. An example that illustrates these qualities is found in
a  track  meet.  The  finishing  order  in  a  race  is  independent  of  the  time 
scale  used  to  measure  the  race.  However,  if  the  order  in  which  the 
runners  passed  the  finish  line  is  all  that  is  known,  then  it  is  not 
possible  to  determine  how  close  the  runners  were  to  one  another. 

Table III summarizes the rankings obtained from 92 questionnaires.
The rank totals are the totals of the numbers between 1 and 7
assigned to each characteristic. The smaller the sum, the more important
the characteristic; therefore, the final ranking proceeds from the
smallest sum to the largest.

Table III

RANKING BASED ON 92 QUESTIONNAIRES

Characteristic    Rank Totals    Final Ranking

      A              231.0             1
      B              466.0             7
      C              292.5             2
      D              373.0             4
      E              390.0             5
      F              456.0             6
      G              367.5             3

Two different statistics for measuring rank correlation will be
used in the remainder of the discussion. The first, the coefficient of
concordance, W, is used when three or more rankings are compared. The
second, the coefficient of rank correlation, τ, is used when two rankings
are compared.
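
As a hedged sketch of how the coefficient of concordance can be computed, the functions below follow the usual textbook form of Kendall's W: with m rankings of n items, S is the sum of squared deviations of the rank totals from their mean m(n+1)/2, and W = 12S / (m^2 (n^3 - n)), with a correction term when a ranking contains ties. The function names are illustrative, and the source does not say how ties were treated in its own calculations.

```python
from collections import Counter


def w_from_totals(totals, m):
    """Uncorrected W from the n rank totals of m rankings."""
    n = len(totals)
    mean_total = m * (n + 1) / 2.0
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12.0 * s / (m * m * (n ** 3 - n))


def kendall_w(rankings):
    """Tie-corrected W from a list of m rankings, each a list of n ranks."""
    m = len(rankings)
    n = len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2.0
    s = sum((t - mean_total) ** 2 for t in totals)
    # Each group of t tied ranks within one ranking contributes (t^3 - t) / 12.
    tie_term = sum(
        (count ** 3 - count) / 12.0
        for r in rankings
        for count in Counter(r).values()
    )
    return s / (m * m * (n ** 3 - n) / 12.0 - m * tie_term)


# Rank totals for characteristics A..G as listed in Table III.
table_iii_totals = [231.0, 466.0, 292.5, 373.0, 390.0, 456.0, 367.5]
w_uncorrected = w_from_totals(table_iii_totals, m=92)
```

Applied to the Table III totals without any tie correction, this gives a value near 0.18. It need not reproduce a figure quoted in the text exactly, since the individual rankings (and hence the ties) cannot be recovered from the totals alone.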

2.  Test of Significance

The 92 respondents are a sample from a larger population;
although it is of some interest to measure the relationship between
document retrieval system characteristics and their importance to these
92 individuals, it is of much greater interest to be able to generalize
the results to the parent population. This involves a test of the
significance of the rank correlation statistic computed from the sample.

To test the significance of some sample statistic, the observed
value of the statistic is compared to the entries in a frequency
distribution of all values the statistic may take on. Each of the possible
values in the frequency distribution has a certain probability of
occurrence. If the probability of reaching the observed value of the
statistic by random occurrence is sufficiently low (say 0.01), then it is
possible to conclude that the observed value is significant. In the
present context, a significant value of the coefficient of concordance
implies agreement among the respondents in their ranking of retrieval
system characteristics. In the following tests, rankings of retrieval
system characteristics are said to agree if there is no more than one
chance in a hundred of attaining or bettering the observed value of the
sample statistic (W or τ) by chance alone. The one-percent significance
level is commonly used in statistical tests. Methods for testing the
significance of W and τ are discussed in Chapters 4 and 6 of Kendall's book.

The value of the coefficient of concordance derived from the
92 responses is W = 0.1705. This value lies far beyond the one-percent
significance point; that is, the probability of arriving at the observed
or a greater value by chance is less than one in a hundred. On the basis
of this test it is fair to conclude that there is agreement among the
92 rankings.

A study of Table III reveals that the chief reason for the
significance of W is the almost universal agreement on the importance of
characteristic A (the minimum time characteristic). This situation can
be compared to ranking seven milers (Olympic champion Herb Elliott and
six high school runners) on the basis of a series of test races. Even
were the six high school milers equally matched, so that their finishing
order was random, the fact that Elliott always came in first would tend
to yield a significant coefficient of concordance over the observed
trials.
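
The effect of a single dominant item can be illustrated with a small simulation. This is a sketch under stated assumptions (one runner always first, the other six placed at random, no ties); the function names and the seed are hypothetical and nothing here comes from the study's data.

```python
import random


def concordance_from_totals(totals, m):
    """Uncorrected coefficient of concordance from rank totals."""
    n = len(totals)
    mean_total = m * (n + 1) / 2.0
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12.0 * s / (m * m * (n ** 3 - n))


def simulate_races(trials, runners=7, seed=0, champion=True):
    """Rank totals over `trials` races; runner 0 always wins if `champion`."""
    rng = random.Random(seed)
    totals = [0] * runners
    for _ in range(trials):
        if champion:
            rest = list(range(2, runners + 1))
            rng.shuffle(rest)          # the others finish in random order
            places = [1] + rest
        else:
            places = list(range(1, runners + 1))
            rng.shuffle(places)        # everyone finishes in random order
        for runner, place in enumerate(places):
            totals[runner] += place
    return concordance_from_totals(totals, trials)


w_with_champion = simulate_races(92, champion=True)
w_all_random = simulate_races(92, champion=False)
```

With one runner always first and the rest random, W settles somewhat above 0.375 however many races are run, whereas with every runner random it stays close to zero. A sizable W can therefore reflect agreement about one item only, which is the point of the analogy.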

The dominance of characteristic A is eliminated by computing
and testing the significance of the coefficient of concordance
for the six characteristics B-G. This was done, and the value W = 0.0603
also proved significant at the one-percent level. In the remaining
analysis the significance of W and τ is tested both for characteristics
A-G and for characteristics B-G. The letter "S" indicates significant
agreement; the letters "NS," non-significance.
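
One common way to test W for a large number of rankings is the chi-square approximation, in which m(n - 1)W is referred to a chi-square distribution with n - 1 degrees of freedom. The sketch below assumes that approximation; the source cites Kendall's book for its methods but does not state which procedure was actually applied. The critical values are standard upper one-percent table entries, and the function names are illustrative.

```python
# Upper 1-percent points of the chi-square distribution (standard table values).
CHI2_CRITICAL_01 = {5: 15.086, 6: 16.812}


def concordance_chi_square(w, m, n):
    """Return (chi-square statistic, degrees of freedom) for a given W."""
    return m * (n - 1) * w, n - 1


def is_significant(w, m, n):
    """True if W exceeds the 1-percent point under the approximation."""
    stat, df = concordance_chi_square(w, m, n)
    return stat > CHI2_CRITICAL_01[df]


# Values of W reported in this section for the 92 respondents.
stat_ag, df_ag = concordance_chi_square(0.1705, m=92, n=7)  # characteristics A-G
stat_bg, df_bg = concordance_chi_square(0.0603, m=92, n=6)  # characteristics B-G
```

Under this approximation both reported values exceed the one-percent point by a wide margin (roughly 94 against 16.8, and roughly 28 against 15.1), which is consistent with the conclusions stated in the text.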

3.  Ranking Within Categories

a.  Ranking Within Companies

It seems reasonable to assume that the respondent's attitude
about document retrieval system characteristics is conditioned by the
retrieval system available to him. To test this assumption, the 92 rankings
were grouped by company and the coefficient of concordance computed for
the responses within each company. Table IV summarizes the calculations.

Table IV

RESULTS OF TESTS FOR AGREEMENT WITHIN COMPANIES

                           Agreement at 0.01 Level
             Sample    Characteristics   Characteristics
Company       Size          A-G               B-G

SRI            22            S                 S
Sylvania       27            S                 NS
IBM            18            NS                NS
Lockheed       25            S                 S

Note that people within separate companies could not always agree among
themselves as to the relative importance of the various requirements.


b.  Ranking  Within  Job  Classifications 

Another interesting hypothesis was that there would be
agreement on the rankings within different job classifications; that is,
Research Managers would agree on what the important requirements are.
To test this hypothesis, the 92 engineers were classified by their answers
to Question 23 (Appendix F). The results are shown in Table V.

Table V

RESULTS OF TESTS FOR AGREEMENT WITHIN JOB CLASSIFICATIONS

                                   Agreement at 0.01 Level
                     Sample    Characteristics   Characteristics
Job Classification    Size*         A-G               B-G

Research Manager       17            S                 NS
Senior Engineer        44            S                 NS
Engineer               26            S                 S
Junior Engineer         4            NS                NS

* One respondent did not classify his job.

From the test results, it appears that, aside from
characteristic A, there is almost complete disagreement within all job
classifications about the relative importance of retrieval system
characteristics.

c.  Ranking Within Academic Degree Groups

Another significance test was run on the 92 responses
grouped by academic degree. The results for four categories are shown
in Table VI. Within each academic degree there is complete agreement
about the relative importance of the requirements when characteristic
A is included. Without characteristic A, there is complete disagreement
within each academic degree.

Table VI

RESULTS OF TESTS FOR AGREEMENT WITHIN ACADEMIC DEGREE GROUPS

                                    Agreement at 0.01 Level
                      Sample    Characteristics   Characteristics
Highest Degree Held    Size          A-G               B-G

BSEE                    26            S                 NS
MSEE                    35            S                 NS
Engineer                 7            S                 NS
PhD, ScD                14            S                 NS

d.  Ranking  Within  Author  and  Non-Author  Categories 

The amount of searching performed, and consequently the
information requirements, may depend on whether the respondent has written
any books, papers, or articles. To test this hypothesis, the concordance
coefficient was computed for the rankings after the engineers were grouped
into those that had published, and those that had not. The results are
shown in Table VII.

Table VII

RESULTS OF TESTS FOR AGREEMENT WITHIN AUTHOR AND
NON-AUTHOR CATEGORIES

                                 Agreement at 0.01 Level
Author Category    Sample    Characteristics   Characteristics
of Respondent       Size          A-G               B-G

Did not publish      47            S                 NS
Did publish          45            S                 S

Both groups agreed within themselves when characteristic A was included.
Otherwise, only the group of authors agreed.

e.  Ranking Within Age Groups

It is possible that information requirements might depend
upon the user's age; consequently, a test was run on the agreement within
each age group. The results are shown in Table VIII.

Table VIII

RESULT OF TESTS FOR AGREEMENT WITHIN AGE GROUPS

                              Agreement at 0.01 Level
                Sample    Characteristics   Characteristics
Age Group        Size          A-G               B-G

25-29*            21            S                 NS
30-34             27            S                 NS
35-39             24            S                 NS
40-44             16            S                 NS
45 and over        3            NS                NS

* The group of under 25 years had only one member and was not considered
  further.


This test indicates that within the age groups there is almost
general agreement on rankings when characteristic A is included,
and complete disagreement otherwise.

f.  Ranking Within Specialty Fields

It  was  hypothesized  that  the  rankings  would  be  different 
within  specialty  fields.  A  test  was  run  on  the  agreement  within  specialty 
groups  and  the  results  are  shown  in  Table  IX. 

Table IX

RESULT OF TESTS FOR AGREEMENT WITHIN SPECIALTY FIELDS

                                           Agreement at 0.01 Level
                              Sample    Characteristics   Characteristics
Specialty Field                Size          A-G               B-G

Circuits and devices            40            S                 NS
Microwave and communication     19            S                 S
Antennas and propagation         9            S                 NS
Communication theory             6            NS                NS
All others                      18            NS                NS

There was some agreement within specialty fields when characteristic A
was considered; otherwise there was generally disagreement within each
specialty field.

4.  Rankings Between Categories

Where there is significant agreement among the responses within
a category,* it is possible to compare rankings between categories. For
example, the employees at SRI agreed on the ranks assigned to the
retrieval characteristics B-G. The same can be said of the Lockheed
employees. Assuming the samples represent SRI and Lockheed worker
attitudes, it is reasonable to test the agreement between (not within)
the SRI and Lockheed rankings.

The  following  analyses  are  restricted  to  comparisons  of  those 
categories  whose  members  agreed  in  their  rankings  of  the  retrieval  system 
characteristics;  i.e.,  categories  in  Tables  IV-IX  in  which  agreement  at 
the  0.01  level  is  significant. 

a.  Rankings Between Companies

Table  X  shows  the  rankings  of  characteristics  A-G  derived 
from  the  various  companies. 

Table X

RESULT OF TESTS FOR AGREEMENT BETWEEN
COMPANIES - CHARACTERISTIC A INCLUDED

                     Characteristics
Company       A    B    C    D    E    F    G

SRI           2    7    1    3    4    6    5
Sylvania      1    7    3    5    4    6    2
Lockheed      1    6    2    4    5    7    3
Consensus     1    7    2    4    5    6    3

The value of the concordance coefficient W = 0.865 is
significant at the 0.01 level. A comparison of the rankings of
characteristics B-G is shown in Table XI.
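
The consensus row of a between-category table can be obtained by totaling the ranks in each column and ranking the totals, with tied totals sharing the average rank. The sketch below does this for the three company rankings of Table X and also evaluates the uncorrected coefficient of concordance; the function name is illustrative, and the approach is an assumption about how the consensus rows were formed, although it reproduces the row shown in the table.

```python
def consensus(group_rankings):
    """Return (column totals, consensus ranks, uncorrected W)."""
    m = len(group_rankings)
    n = len(group_rankings[0])
    totals = [sum(r[i] for r in group_rankings) for i in range(n)]
    # Rank of a total = 1 + number of smaller totals
    #                   + half the number of other totals equal to it.
    ranks = [
        1
        + sum(t < totals[i] for t in totals)
        + (sum(t == totals[i] for t in totals) - 1) / 2.0
        for i in range(n)
    ]
    mean_total = m * (n + 1) / 2.0
    s = sum((t - mean_total) ** 2 for t in totals)
    w = 12.0 * s / (m * m * (n ** 3 - n))
    return totals, ranks, w


# Rankings of characteristics A..G from Table X.
sri = [2, 7, 1, 3, 4, 6, 5]
sylvania = [1, 7, 3, 5, 4, 6, 2]
lockheed = [1, 6, 2, 4, 5, 7, 3]
totals, ranks, w = consensus([sri, sylvania, lockheed])
```

For these three rows the column totals are 4, 20, 6, 12, 13, 19, and 10, the consensus order is A, C, G, D, E, F, B, and W comes out at about 0.865.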


* If the members of a category can not agree among themselves, there is
  no point in looking for agreement between this category and another.

Table XI

RESULT OF TESTS FOR AGREEMENT BETWEEN
COMPANIES - CHARACTERISTIC A EXCLUDED

                     Characteristics
Company       B     C    D    E     F     G

SRI           6     1    2    3     5     4
Lockheed      5     1    2    4     6     3
Consensus     5.5   1    2    3.5   5.5   3.5

The coefficient of rank correlation has the value τ = 0.73,
which has three chances in 100 of being equalled or bettered by chance
alone. This is not below the 0.01 level used to define significant
agreement.
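
For two untied rankings of only six items, the chance probability can be found exactly by enumerating all 720 orderings. The sketch below computes Kendall's coefficient as (concordant pairs minus discordant pairs) divided by the number of pairs, then counts the orderings that do at least as well. The SRI and Lockheed rows are taken from Table XI, with the entry for characteristic C read as 1 in both rows on the assumption that each row is a permutation of 1 to 6; the function names are illustrative.

```python
from itertools import combinations, permutations


def kendall_s(x, y):
    """Concordant minus discordant pairs between two rankings."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    return s


def tau_and_exact_p(x, y):
    """Kendall's tau and the one-sided exact probability (no ties assumed)."""
    n = len(x)
    pairs = n * (n - 1) // 2
    observed = kendall_s(x, y)
    base = list(range(1, n + 1))
    hits = 0
    total = 0
    for perm in permutations(base):
        total += 1
        if kendall_s(base, perm) >= observed:
            hits += 1
    return observed / pairs, hits / total


# Rankings of characteristics B..G (characteristic A excluded).
sri = [6, 1, 2, 3, 5, 4]
lockheed = [5, 1, 2, 4, 6, 3]
tau, p_value = tau_and_exact_p(sri, lockheed)
```

Here 13 of the 15 pairs are concordant and 2 are discordant, giving a coefficient of 11/15, or about 0.73. Twenty of the 720 possible orderings do at least as well, a probability of roughly 0.028, in line with the "three chances in 100" quoted above.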

b.  Rankings Between Job Classifications

Table XII shows the rankings of characteristics A-G
derived from the various job classifications.

Table XII

RESULT OF TESTS FOR AGREEMENT BETWEEN JOB CLASSIFICATIONS

                            Characteristics
Job Classification    A    B    C    D    E    F    G

Research manager      1    7    4    2    5    6    3
Senior engineer       1    7    2    3    5    6    4
Engineer              2    6    1    5    3    7    4
Consensus             1    7    2    3    5    6    4

The coefficient of concordance W = 0.825 is significant
at the 0.01 level. These three job classifications agree within
themselves and between each other when characteristic A is included.

c.  Rankings Between Academic Degree Groups

Table XIII shows the rankings of characteristics A-G by
academic degree. The concordance coefficient W = 0.915 is significant
at the 0.01 level.

Table XIII

RESULT OF TESTS FOR AGREEMENT BETWEEN ACADEMIC DEGREE GROUPS

                          Characteristics
Academic Degree    A    B    C    D     E    F    G

BSEE               1    6    2    5     4    7    3
MSEE               1    7    2    3     4    6    5
Engineer           2    7    1    3     5    6    4
PhD, ScD           1    7    2    4     5    6    3
Consensus          1    7    2    3.5   5    6    3.5

All the academic categories agree within themselves and between each
other when characteristic A is included.

d.  Rankings Between Author and Non-Author Categories

The  author,  non-author  rankings  are  shown  in  Table  XIV. 

Table XIV

RESULT OF TESTS FOR AGREEMENT BETWEEN AUTHOR AND
NON-AUTHOR CATEGORIES

                            Characteristics
Author Category     A    B     C    D     E     F     G

Did not publish     1    6     2    3     5     7     4
Did publish         1    7     2    4.5   4.5   6     3
Consensus           1    6.5   2    4     5     6.5   3

The value of the coefficient of rank correlation is
τ = 0.878, which is significant at the 0.01 level. These two categories
agree within themselves and between each other when characteristic A is
considered.


e.  Rankings Between Age Groups

The rankings between the four age groups are shown in
Table XV.


Table XV

RESULT OF TESTS FOR AGREEMENT BETWEEN AGE GROUPS

                      Characteristics
Age Group     A    B    C    D     E     F    G

25-29         1    6    2    5     3     7    4
30-34         1    7    2    3     5     6    4
35-39         1    7    2    4.5   4.5   6    3
40-44         1    7    2    4     6     5    3
Consensus     1    7    2    4     5     6    3

The concordance coefficient is W = 0.905, which is significant
at the 0.01 level. The members of these age groups agree within
themselves and between each other when characteristic A is considered.

f.  Ranking Between Specialty Fields

The rankings between three specialty groups are shown in
Table XVI.


Table XVI

RESULT OF TESTS FOR AGREEMENT BETWEEN SPECIALTY FIELDS

                         Characteristics
Specialty Field    A    B     C    D    E    F     G

Circuits           1    7     ?    ?    ?    6     4
Microwave          1    6.5   ?    ?    ?    6.5   2.5
Antennas           1    7     ?    ?    ?    6     3
Consensus          1    7     2    4    5    6     3

(Entries marked ? are illegible in this copy.)

The value of the concordance coefficient W = 0.94 is
significant at the 0.01 level. These three specialty fields agree within
themselves and between each other when characteristic A is included.

5.  General Comments About the Rankings


In general, there is disagreement about the relative importance
of Characteristics A-G. Even though a composite ranking was obtained
(Table III), further analysis showed that there was disagreement
within nearly every sub-group of the sample population. In only two
of the six breakdowns (grouping by academic degree and by author vs.
non-author) did each of the sub-groups of that breakdown agree within
themselves, and this was when Characteristic A was included. When
Characteristic A was excluded, there was disagreement within at least
one sub-group of each breakdown, and in two breakdowns (grouping by
academic degree and by age) there was disagreement within every single
sub-group of those breakdowns. Sub-groups with internal agreement
always had substantial agreement between them.

One  thing  seems  certain  as  a  result  of  this  ranking  study: 
Characteristic  A  (minimum  time  to  obtain  the  major  group  of  relevant 
references)  seems  to  be  very  important  to  all  of  the  users.  It  is  also 
clear  that  the  users  are  generally  uncertain  and  in  disagreement  about 
the  relative  importance  of  the  remaining  characteristics.  Further 
studies  of  relative  rankings  should  give  some  attention  to  finding  ways 
of  incorporating  greater  resolution  and  accuracy  in  the  measurements, 
and  of  improving  the  list  of  requirements. 




V  A  GENERAL  FUNCTIONAL  MODEL  OF  AN  INFORMATION  RETRIEVAL  SYSTEM 


A.  The  Need  for  a  Model 

A model is a useful tool for describing phenomena of interest. It
provides a means by which a phenomenon can be reduced to its basic
elements, thus simplifying subsequent exploration and analysis. It may
also serve as a useful intellectual exercise, compelling the researcher
to check that all significant points have been considered in his analysis.
Most important, it serves as the framework for analytical or simulation
studies of the system.

Simulation techniques can be profitably used to predict the
performance of an information retrieval system under a variety of operating
conditions and for a variety of system configurations. In this way,
proposed retrieval systems can be studied to determine costs and
performance, without actually installing or operating such systems.
Although there are limits to the results that can be achieved by simulation,
it appears that no extensive simulation experiments have been made to
date for information retrieval systems. Section VI describes some studies
in which the operation of several retrieval systems was simulated over
wide ranges of operating parameters, using the model described in the
following pages, in order to determine the operating cost for particular
problems.

In only a very few cases does a simulation model truly represent the
behavior of the actual system, and in only a few cases can it be
extended or generalized to describe all similar systems. The model
described below was designed to represent the operations of an information
storage and retrieval system. It is general enough to be applied to a
spectrum of systems, from edge-punched cards to large retrieval systems
that utilize computers or other complicated digital equipment. Although
the model is not so general that it can include any retrieval system one
may elect to consider, it can be modified to include additional features.

B.  Description of the Model

The model shows, in general form, all of the operations required to
establish, operate, and maintain an information retrieval system. It is
divided into seven different parts, each of which is relatively
independent. The seven parts of the model, illustrated in flow chart form
in Figs. 1 through 7, are:

(1)  System conversion or establishment

(2)  Acquisitions

(3)  Input

(4)  Search

(5)  Maintenance of the indexing information

(6)  Re-file or return borrowed material

(7)  Handling document requests and inter-library loan.
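
Because the seven parts are relatively independent, a cost program can treat each one as a separate heading and add the results. The sketch below is only one way such a structure might be represented; the class name, function name, and dollar figures are hypothetical and are not taken from the programs described in this report.

```python
from enum import Enum


class ModelPart(Enum):
    """The seven parts of the functional model, as listed above."""
    CONVERSION = "System conversion or establishment"
    ACQUISITIONS = "Acquisitions"
    INPUT = "Input"
    SEARCH = "Search"
    MAINTENANCE = "Maintenance of the indexing information"
    REFILE = "Re-file or return borrowed material"
    REQUESTS = "Handling document requests and inter-library loan"


def total_annual_cost(cost_by_part):
    """Sum per-part annual costs; parts not supplied count as zero."""
    return sum(cost_by_part.get(part, 0.0) for part in ModelPart)


# Hypothetical dollar figures for one candidate system (illustrative only).
example_costs = {
    ModelPart.ACQUISITIONS: 4000.0,
    ModelPart.INPUT: 12000.0,
    ModelPart.SEARCH: 9000.0,
    ModelPart.MAINTENANCE: 1500.0,
}
total = total_annual_cost(example_costs)
```

Keeping the parts as separate keys makes it straightforward to compare candidate systems part by part, or to vary one part (for example, search volume) while holding the others fixed.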

The model was also used as a basis for the cost analysis programs
described in Sec. VI.

FIG. 1  FLOW CHART FOR SYSTEM CONVERSION OR ESTABLISHMENT

FIG. 2  FLOW CHART FOR ACQUISITIONS

FIG. 3  FLOW CHART FOR INPUT

FIG. 4  FLOW CHART FOR SEARCH

FIG. 5  FLOW CHART FOR MAINTENANCE OF THE FILE AND INDEXING INFORMATION

FIG. 6  FLOW CHART FOR RE-FILE OR RETURN BORROWED MATERIAL

FIG. 7  FLOW CHART FOR DOCUMENT REQUESTS AND INTER-LIBRARY LOAN


VI  EVALUATION  TECHNIQUES 


A.  General 

As stated in the Introduction, the actual evaluation procedure
utilizes three complementary tools: (1) basic criteria or screening
procedures to describe the range of some requirements encountered in
operating installations; (2) one or more comprehensive evaluation
procedures that determine how well the performance of a given system
satisfies the requirements of a particular population of users; (3)
two cost analysis programs that determine the equivalent annual operating
costs of a retrieval system given a description of its functional
characteristics. These three tools are described in more detail in the
following sections.

B.  Preliminary Screening for Ranges of Requirements of Information
    Retrieval Systems

A number of equipment manufacturers and some librarians have
suggested the possibility of developing "universal" information retrieval
systems that could generally be applied to any problem. In order to
test any claims of universality, some data must be available to describe
the range and distribution of the parameters of the "universal" problems.
To be completely universal, a proposed system would have to be able to
accept or adapt to wide ranges in the file size, accession rate, search
volume, search response times, indexing complexity, cost, type of file
material, and many other parameters in order to accommodate the practical
range of real problem situations that exist. This section of the report
provides some information about the distributions of a few problem
parameters, in order to allow some estimates to be made of the degree of
universality of proposed retrieval systems. Only a few problem
parameters have been studied, but it should not be too difficult to describe
the distributions of additional parameters with a moderate amount of effort.

Proponents of a semi-universal system might consider applying it to
specific types of organizations or to specific subject fields. An
example of the first type would be a proposal for a system for college and
university libraries, or for public libraries. An example of the second
type would be a proposal for a system for the handling of all the
literature in any one field of science or technology. Some background
information to assist in the evaluation of such general proposals is given in
Figs. 8, 9, 10, and 11.

Figure 8 portrays the current file size and accession rate of each
of the U.S. college and university libraries and provides information on
the cumulative distributions of these parameters. It shows, for example,
that to be applicable to 90 percent of the U.S. college and university
libraries, a universal system would have to have the storage or indexing
capacity for at least 200,000 file items, and the capability for accepting
the input of at least 200 new file items per week without developing a
backlog.

Figure  9  portrays  the  same  type  of  information  for  the  U.S.  public 
library  systems.  It  shows,  for  example,  that  to  be  applicable  to  90 
percent  of  the  U.S.  public  libraries,  a  universal  system  would  have  to 
have  the  storage  or  indexing  capacity  for  at  least  500,000  file  items, 
and  the  capability  for  accepting  the  input  of  at  least  630  new  file 
items  per  week  without  developing  a  backlog. 
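
Reading a requirement such as "applicable to 90 percent of the libraries" off a cumulative distribution amounts to taking a percentile of the observed values. The sketch below shows the idea on invented figures; the function name and both data lists are hypothetical and do not reproduce the data behind Figs. 8 and 9.

```python
def capacity_for_coverage(values, percent):
    """Smallest observed value that meets or exceeds the requirement
    of at least `percent` percent of the installations."""
    ordered = sorted(values)
    # Ceiling of percent/100 * count, done in integer arithmetic.
    needed = (percent * len(ordered) + 99) // 100
    needed = max(needed, 1)
    return ordered[needed - 1]


# Hypothetical file sizes (items) and accession rates (items per week)
# for ten libraries; illustrative values only.
file_sizes = [15000, 22000, 40000, 55000, 70000,
              90000, 120000, 150000, 180000, 1200000]
weekly_accessions = [20, 35, 50, 60, 75, 90, 110, 140, 170, 900]

size_90 = capacity_for_coverage(file_sizes, 90)
rate_90 = capacity_for_coverage(weekly_accessions, 90)
```

In this invented sample a system sized for 180,000 items and 170 accessions per week would cover nine of the ten libraries; covering the tenth would require several times the capacity, which is the kind of trade-off the cumulative distributions are meant to expose.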


For the purposes of this study, a file item was defined as any printed,
typewritten, mimeographed, or processed work, bound or unbound, that
has been fully catalogued or fully prepared for use. Microcards,
microfilms, and other forms of microtext are included. The accession
rate is defined as the actual number of file items acquired, and does
not consider the file items withdrawn or purged from the file.

The public library systems in this case are defined as collections of
individual libraries working together cooperatively in one city (e.g.,
the San Francisco Public Library System). Presumably the control of
each of these library complexes is centralized enough to consider each
single library system as a candidate for a single information retrieval
system, and not to consider applying retrieval systems to individual
libraries.

FIG. 8  U.S. COLLEGE AND UNIVERSITY LIBRARIES - FILE SIZE AND ACCESSION RATES

SOURCE: Library Statistics of Colleges and Universities, 1959-60,
Part 1: Institutional Data, U.S. Dept. of Health, Education and
Welfare, Office of Education, J. C. Rather and D. C. Holladay,
Report OE-15023 (1961).

FIG. 9  U.S. PUBLIC LIBRARY SYSTEMS - FILE SIZE AND ACCESSION RATES
(axis label: NUMBER OF ITEMS IN THE FILE)

SOURCE: 1. Statistics of Public Library Systems in Cities with
Populations of 100,000 or More: Fiscal Year 1956, U.S. Dept. of
Health, Education & Welfare; Office of Education, Circular 590
(June 1959).
2. Statistics of Public Library Systems in Cities with Populations
of 50,000 to 99,999, Fiscal Year 1958, U.S. Dept. of Health,
Education, & Welfare; Office of Education, Circular 594 (July 1959).
3. Public Library Statistics: 1944-45 (for Cities with Populations
of 25,000 to 49,999), Fed. Security Agency; Office of Education (1947).

FIG. 10  ACCUMULATED FILE SIZES AND CURRENT ACCESSION RATES OF THE PUBLICATIONS
OF SEVERAL ABSTRACTING AND INDEXING SERVICES

FIG. 11  RANGE OF SYSTEM REQUIREMENTS FOR THE NUMBER OF DESCRIPTORS PER DOCUMENT
(axis label: INSTALLATION NUMBER)

Some data were collected to describe the file size and accession
rates of industrial research libraries of different types, but they were
not complete enough to allow the same type of definitive statements to
be made as were made for the university and public libraries.

Proposals have been made for establishing mechanized literature
searching systems for the files of the existing abstracting and indexing
services. It is not completely unreasonable to suggest that a new
reference center for the publishers of such publications as Chemical
Abstracts, Index Medicus, or the Review of Metal Literature might
consider encoding all of the citations or abstracts that had ever been
prepared by them and including them in a file for searching.
For that reason, data were collected to describe the total warehouse
of citations or abstracts that had ever been published by each of
several indexing and abstracting services, to show the amount of storage
or indexing capacity that would be required of a system universally
applicable to all such services. Data were also collected to describe
the required accession rates, and are shown in Fig. 10. No cumulative
distributions are shown since data for many more services were not
available. However, any universal system prepared to accommodate the
files of all of the indexing and abstracting services would require a
storage or indexing capacity of at least 2.6 million items, and a
capability for accepting the input of at least 2,900 new file items
per week without developing a backlog. Appendix E gives the identities
and exact figures for the data shown in this figure.

The  indexing  and  abstracting  services  do  not  represent  the  total 
volume  of  literature  that  might  be  included  in  a  retrieval  system  since 
they  are  usually  restricted  in  their  degree  of  coverage  by  their  budget 
and  other  considerations.  Some  data  indicate,  for  example,  that  to 
handle  the  entire  volume  of  periodical  literature  for  the  fields  of 
medicine,  agriculture,  chemistry,  and  the  biological  sciences  might 
require  a  capability  for  accepting,  indexing,  and  storing  an  input  of 
approximately 220,000, 150,000, 150,000, and 150,000 file items per year,
respectively.
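
To set these annual volumes beside the weekly input rates quoted earlier in this section, they can be converted to items per week. The short sketch below assumes the input is spread evenly over 52 weeks, which is an assumption made here for illustration and not a statement from the source.

```python
import math

WEEKS_PER_YEAR = 52  # assumption: input spread evenly over the year

# Annual periodical-literature volumes quoted in the text (file items per year).
annual_items = {
    "medicine": 220000,
    "agriculture": 150000,
    "chemistry": 150000,
    "biological sciences": 150000,
}

# Round up, since a fractional item still has to be processed.
weekly_items = {
    field: math.ceil(count / WEEKS_PER_YEAR)
    for field, count in annual_items.items()
}
```

On that assumption the medical literature alone would call for an input capability of roughly 4,200 items per week, and each of the other three fields for nearly 2,900.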


From an indexing standpoint, a universal retrieval system would
have to accommodate a large variety of indexing systems, each of which
could be implemented with varying degrees of complexity. It must lend
itself  to  the  use  of  classification  and  indexing  systems  such  as  the 
following:  hierarchical  schemes  such  as  the  Universal  Decimal,  Dewey 
Decimal,  and  the  Library  of  Congress  classification  schemes;  a  variety 
of  coordinate  indexing  systems  and  their  variations  such  as  Uniterms, 
links  and  roles,  descriptors,  and  keywords;  faceted  classification 
schemes  such  as  those  proposed  by  Ranganathan,  Vickery,  and  others; 
and  more  complex  systems  such  as  the  Perry-Kent  system  of  telegraphic 
abstracting and indexing. One brief illustration of the indexing
capability required of a universal system is given in Fig. 11, which
shows the range of descriptors or Uniterms required for each file item
in a number of actual installations using this type of indexing. These
data suggest that such a system would require the capability for
implementing a coordinate indexing system with at least 50 descriptors
per file item.

Hopefully, the preceding discussion provides a preliminary basis
for accepting or rejecting claims of the universality of proposed
retrieval systems. The next sections describe more comprehensive
evaluation techniques that have been developed for the analysis of
retrieval systems proposed for specific applications.

C. General Performance Evaluation

Two approaches were developed to obtain a measure of how well any
specific information system satisfies the requirements of the users.
The first method matches the measured performance with the requirements,
applies weighting factors to each requirement, and determines an
over-all figure of merit. The second method utilizes a model which
attempts to reduce all the requirements and performance statements to
the common denominators of time or cost. Both of these methods are
described in more detail below.
1. Performance-Requirement Matching with Weighting


This procedure was developed as an interim tool to provide
rough performance evaluations. It could be extended to become a more
useful tool; however, it does have the disadvantage of relying, to a
certain measure, on opinions of the users. There is also another
fundamental problem that poses a stumbling block: the question of
developing weighting factors that describe the relative importance
of each of the system requirements. This problem is discussed in
more detail at the end of the description of the first performance
evaluation procedure.

This procedure develops a measure of how well any specific
information system satisfies the users' requirements by matching the
measured performance with the requirements, and applying proper
weighting factors. Certain basic information about the system and the
users (shown in general form in Fig. 12) is needed for this evaluation:

(1) A list of factors or considerations that are normally
called "user requirements" (e.g., required response
time and false drop rate) should be developed. There
is no fundamental restriction on the sequence or the
number of requirements that can be entered on this
list, although to simplify the measurements and
computation the list may, in practice, be held to about
ten or twelve entries. Once a master list of
requirements has been established and tested, it may
be useful as a standard for subsequent evaluations.

(2) A measure of the relative importance of each of the
requirements should be obtained from the users to be
served by this system. That is, a weighting figure
for each requirement should be obtained that reflects
the relative importance of that requirement to the
users being served. It is quite likely that different
groups of users will rank or weight the requirements
differently. However, after enough measurements have
been made of representative user groups, it may be
possible to arrive at empirical rule-of-thumb weightings
or design guidelines that could be used for most
subsequent evaluations. The weightings would also be
useful to equipment and system designers, to aid in the
development of systems that more nearly satisfy the
users' needs.

* This is discussed further in the subsequent description of the second
performance evaluation method.

FIG. 12 WORKSHEET FOR PERFORMANCE EVALUATION - GENERAL CASE

(3) For each requirement listed, measurements should be
made to quantitatively describe the users' requirement.
For some requirements (e.g., the ease of communication
with the system) it may be extremely difficult or
impossible to obtain any measurements, and consequently
it will be impossible to measure how well the system
satisfies the user requirement. But although one
cannot obtain a quantitative measure of how well the
proposed system satisfies this requirement, the analyst
will at least know the relative importance of this
requirement and can treat it separately. As with the
users' weighting of the requirements, the actual
measurements of the requirements may differ among
different groups of users. However, there is the
possibility, just as with the requirement ranking, that
after enough measurements have been taken from
representative groups, it may be possible to arrive at
general guidelines or standards that could be adapted
for subsequent evaluations, thus eliminating the need
for more measurements. The measurements and rankings of
the requirements could be used as specifications or
design goals for the equipment and system designers.



(4) For each requirement listed, the proponent of each
candidate system being evaluated must provide data to
describe its performance for this particular parameter.
To simplify the evaluation, these data should be in the
same form as the measurements of the user requirements,
that is, the same coordinates and scales.

The  evaluation  procedure  then  consists  of  the  following  operations  (see 
the  sample  worksheet  in  Figure  12)  for  a  given  candidate  system: 

(1)  For  the  first  requirement  on  the  list,  determine 
the  measure  of  agreement  between  the  system  per¬ 
formance  and  the  user  requirement.  The  detailed 
procedure  for  obtaining  this  measure  of  agree¬ 
ment  is  given  in  Appendix  B. 

(2) For the same requirement, multiply the measure
of agreement by a weighting coefficient that
represents the relative importance of that
requirement, and record the resulting score for
this requirement.

(3) Repeat the first two steps for each of the
requirements on the list. When these operations
have been performed on all of the requirements,
add up all the weighted scores to arrive at the
total score, which is a single figure of merit.
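
The three operations above amount to a weighted sum. The following is a minimal sketch in Python, assuming each measure of agreement has already been reduced to a number between 0 and 1 (the actual procedure is in Appendix B and is not reproduced here). The class name, requirement names, weights, and agreement values are hypothetical and serve only as illustration; they are not measured data.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    weight: float     # relative importance obtained from the users (assumed scale)
    agreement: float  # assumed 0..1 measure of agreement with the requirement


def figure_of_merit(requirements):
    """Add up weight * agreement over every requirement on the worksheet."""
    return sum(r.weight * r.agreement for r in requirements)


# Hypothetical worksheet entries for one candidate system.
worksheet = [
    Requirement("response time", weight=0.5, agreement=0.8),
    Requirement("false drop rate", weight=0.3, agreement=0.5),
    Requirement("coverage", weight=0.2, agreement=1.0),
]
score = figure_of_merit(worksheet)
```

Repeating the calculation with a second set of weights from a different user group would show how sensitive the single figure of merit is to the weighting.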

The actual performance of any system will depend to a certain
extent upon parameters such as the file size, the accession rate, and
the volume of search requests. Consequently, the performance figures
given for a specific analysis may not be applicable to the entire range
of variations in the operating environment. It is also unlikely that
any single figure of merit will have the same value for all different
operating environments. For this reason, it may be advantageous to
compute a set of performance figures for different sets of environments
so that a candidate system's evaluation can take place over a range of
operating situations. It might be useful to compute and display a set
of performance evaluations in a manner similar to the cost analysis
procedure described in a subsequent section.*

2.  Performance  Evaluation  With  a  Time-Cost  Model 

The first procedure could be implemented now, on an interim
basis. Although this second procedure would require considerably more
development before it can be useful, it does show promise as an
evaluation procedure. One major objection to the first procedure is
that, to a certain extent, it measures user requirements by sampling
opinion. We ask the user to select, from a limited number of choices,
values of certain characteristics that in some sense satisfy his needs,
instead of formulating document retrieval system models that tie user
requirements and system characteristics to service and cost. Opinion
sampling is often the only way of proceeding where information cannot
be obtained analytically. However, where an analytical approach is
possible, opinions should be subordinated to facts (e.g., a poll of
stock clerks is not a valid basis for designing an inventory control
system). A model of the system should still be constructed, but it
should be a model from which we could derive optimal procedures.

During the course of this project, we have developed a
framework for describing a document retrieval system in terms of cost
and service. Although there are many formidable problems involved in
applying this model, it is felt to be structurally sound. Its inputs
are measurements of performance and costs rather than the opinions of
potential users. A preliminary description of this approach is given in
a subsequent section of this report.


* See Fig. 16, p. 73, for an illustration of such a display.


3. Comments on the Performance Evaluation Procedures

For immediate and rough measures of performance, the first
method  and  its  associated  interview  guide  would  appear  to  be  the  most 
appropriate.  For  future  evaluations,  the  second  method  with  further 
development  might  be  more  appropriate.  Neither  method  has  been  tested 
with  representative  systems,  and  both  could  use  considerably  more  study 
and  development. 

D. Cost Analysis

1. General Form

One of the early plans of this project was to develop a computer
program to take the model flow charts (Figs. 1 through 7) and all the
necessary accompanying information to describe the labor, equipment,
material, and other requirements for each of the functional boxes shown
in the flow charts, and simulate the operation of defined information
systems. However, because of the short duration of the project and the
unavailability of the necessary time and cost information for most of
the basic operations shown on the charts, it was necessary to resort to
a much simpler program. As actually written and used, the program accepts
summary statements about each of the seven basic parts and uses this
information to compute an annual operating cost for the system under
study. For analysis purposes, the flow charts are studied in the
context of the particular system being studied, and serve as a checklist
and a worksheet. Blocks on the charts that do not apply to the system
being studied are crossed out, and the remaining blocks are studied by
a knowledgeable person to determine the labor, equipment, and material
required to perform that function.

The present program accepts the following input data for
subsequent processing:

Cost figures

(1) Wage rates for each of 20 different labor
categories

(2) Purchase or lease costs for each of 40 different
pieces of equipment (first, second, and third
shift costs)

(3)  Material  costs  for  each  of  20  different  types 
of  materials 

(4)  Costs  for  each  of  20  miscellaneous  items 

Cost  functions 

(1)  Statements  that  are  functions  of  the  file 
accession  rate 

(2)  Statements  that  are  functions  of  the  volume 
of  search  requests 

(3)  Statements  that  are  functions  of  the  mis¬ 
cellaneous  relationships 

Constants 

(1)  Initial  file  size 

(2)  Amortization  period 

(3)  Rate-of-return  to  be  used  for  amortization 
calculations 

(4)  Burden  percentage 

(5)  Overhead  percentage. 

The present program assumes that a whole number of people will
be used, so that each fraction of a type of laborer is rounded off to
the next higher integer. Each particular type of laborer (e.g., clerk)
works on any task that requires that labor type. Similarly, only whole
numbers of equipment units will be used, so that the program will always
charge the full cost of one computer, even though the computer may only
be required for two hours per day.

Using the statements about the amount of each type of labor
required to process the input items, conduct the searches, and perform
all the other necessary tasks, the program determines the total amount
of each type of direct labor required. The program then estimates the
amount of each type of indirect labor required, such as managers. At
present, the program adds a manager when the total working staff reaches
5 persons, and adds an assistant manager for each increment of 20
persons after that. The salaries of all the direct and indirect labor
types are then totaled to determine the basic labor charge. The burden
(allowances for vacation, sick leave, social security, etc.) and the
overhead charges are then used in a standard accounting manner to arrive
at the total loaded labor costs.
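
The labor computation described above can be sketched as follows. This is an illustration under stated assumptions, not the program itself: the report does not spell out the "standard accounting manner", so the sketch assumes burden is a percentage of salaries and overhead a percentage of salaries plus burden. It also reads the staffing rule as one manager at 5 or more staff plus one assistant manager per full 20 persons beyond 5. All names and figures are hypothetical.

```python
import math


def loaded_labor_cost(labor_needed, wage_rates, manager_wage, assistant_wage,
                      burden_pct, overhead_pct):
    """Return the loaded labor cost for one period (assumed accounting rules)."""
    # Each fractional requirement is rounded up to a whole person.
    staff = {kind: math.ceil(amount) for kind, amount in labor_needed.items()}
    total_staff = sum(staff.values())

    # Indirect labor: assumed reading of the manager / assistant manager rule.
    managers = 1 if total_staff >= 5 else 0
    assistants = max(0, (total_staff - 5) // 20)

    salaries = sum(count * wage_rates[kind] for kind, count in staff.items())
    salaries += managers * manager_wage + assistants * assistant_wage

    burden = salaries * burden_pct / 100.0                  # assumption
    overhead = (salaries + burden) * overhead_pct / 100.0   # assumption
    return salaries + burden + overhead


# Hypothetical monthly figures: 3.2 clerks and 1.5 indexers are needed.
cost = loaded_labor_cost(
    {"clerk": 3.2, "indexer": 1.5},
    {"clerk": 400.0, "indexer": 600.0},
    manager_wage=900.0, assistant_wage=700.0,
    burden_pct=20.0, overhead_pct=50.0,
)
```

Rounding up means that 3.2 clerks are charged as 4, the same effect the text notes for equipment units.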

Next, the equipment requirements are determined, based on the
capacity of each of the individual units of equipment. The program
accepts each piece of equipment on a lease or purchase basis, as defined
by the input data. Lease charges are considered to be a simple monthly
cost. Purchased equipment is amortized over a time interval and at an
interest rate specified by the input data.^10 To simplify the program,
the annual rate of return was divided by 12 to get a nominal (not
effective) monthly rate of return which was then used to determine a
uniform monthly payment. If the rate of return is set at zero percent,
as done in most cursory economy studies, then the cost of the equipment
is simply divided equally among the specified time intervals without
considering the time value of money.
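
The amortization rule just described corresponds to the usual uniform-payment (capital recovery) formula applied at a nominal monthly rate. The sketch below assumes the amortization period is given in years and that payments fall at the end of each month; the function name and sample figures are hypothetical.

```python
def monthly_payment(purchase_cost, annual_rate, years):
    """Uniform monthly payment that amortizes purchase_cost over the period."""
    n = round(years * 12)   # number of monthly payments
    i = annual_rate / 12.0  # nominal (not effective) monthly rate
    if i == 0:
        # Zero rate of return: spread the cost equally over the months.
        return purchase_cost / n
    growth = (1.0 + i) ** n
    return purchase_cost * i * growth / (growth - 1.0)


# Hypothetical unit costing 12,000 dollars, amortized over 5 years.
payment_at_7_percent = monthly_payment(12000.0, 0.07, 5)
payment_at_zero = monthly_payment(12000.0, 0.0, 5)
```

Discounting the 60 payments back at the same monthly rate recovers the purchase cost, which is a convenient check on the formula.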

After the material and other miscellaneous costs are determined,
a final total is obtained, on a monthly basis, for the entire
system. An annual total is then determined, and this is the figure
that is printed out by the computer. This set of computations has been
done for a prescribed initial file size and a specified accession rate
and volume of search requests. The computations are then repeated, using
different sets of accession rates and search volumes as prescribed by
the input data, to prepare the remaining entries for the printed table.

It might also be mentioned that the file grows from its initial
size at the prescribed monthly accession rate. Consequently, the labor
and equipment costs usually increase for each subsequent month's
operation. It was primarily for this reason, and to make the analysis as
realistic as possible, that costs were computed on a monthly basis and
then totaled for the year. For each month, the computations use the
total file size that had accumulated at the beginning of that month.
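
The month-by-month totaling and the repeated runs over accession rates and search volumes can be sketched as below. The monthly cost function stands in for the labor, equipment, material, and miscellaneous computations; its coefficients are invented for illustration and do not come from the report.

```python
def annual_cost(initial_file_size, accession_rate, search_volume, monthly_cost):
    """Total 12 monthly costs, each based on the file size at the start of the month."""
    total = 0.0
    file_size = initial_file_size
    for _month in range(12):
        total += monthly_cost(file_size, accession_rate, search_volume)
        file_size += accession_rate  # the file grows by one month's accessions
    return total


def cost_table(initial_file_size, accession_rates, search_volumes, monthly_cost):
    """Annual cost for every (accession rate, search volume) pair."""
    return {
        (a, s): annual_cost(initial_file_size, a, s, monthly_cost)
        for a in accession_rates
        for s in search_volumes
    }


def example_monthly_cost(file_size, accessions, searches):
    # Hypothetical cost function: fixed charge plus per-item and per-search terms.
    return 100.0 + 0.01 * file_size + 0.5 * accessions + 2.0 * searches


table = cost_table(0, [500, 1000], [50, 100], example_monthly_cost)
```

Each key of `table` corresponds to one entry of a printed table; plotting the entries for several candidate systems on common axes gives the kind of comparison the text describes.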

Many  of  the  procedures,  such  as  the  method  of  accounting  for 
overhead  charges,  were  built  in  as  a  part  of  the  main  program.  However, 
it would be relatively simple to modify or change these procedures if
necessary.

An  evaluation  of  three  representative  information  systems  was 
made  with  this  program.  The  computer  presentation  of  the  results  is  a 
table  of  the  form  shown  in  Figs.  13  through  15.  It  is  useful  to  plot 
these  results  in  the  form  shown  in  Fig.  16  to  allow  a  direct  comparison 
to  be  made  of  the  economics  of  candidate  systems  over  wide  ranges  in 
operating  parameters.  The  sample  comparison  shown  in  Fig.  16  illustrates 
which system, from an economic viewpoint, is most favorable over a given
operating  region.  With  this  program,  relatively  accurate  cost  analyses 
of  proposed  systems  can  be  made  without  actually  implementing  a  full- 
scale  or  pilot  operation  of  the  proposed  system.  In  addition  to  serving 
as  part  of  an  evaluation  procedure  for  proposed  retrieval  systems,  the 
model  can  also  be  used  effectively  as  a  research  tool  to  determine  the 
effects  of  varying  the  parameters  and  over-all  system  design.  It  can 
also  be  used  to  test  proposed  systems  that  have  no  counterpart  in  any 
existing  installation. 

It is not our intent in Fig. 16 to show that one system is
better than another. For this reason, we have omitted any detailed
description or identification of these systems in this report.
Considering the preliminary nature of our time and performance data,
comparison would be unfair to all three systems. We merely want to
demonstrate that evaluation procedures were developed that could produce
this type of information. The credibility of the analysis depends in
large measure on the accuracy of the basic time and cost data, and that
accuracy in our sample evaluations is highly suspect. It is quite
possible that a system of pre-determined times for standard elemental
operations could be used for many of the operations of any proposed
retrieval system. However, accurate, standardized times do not exist
and, to our knowledge, no concerted effort is being made by any
organization to develop such data. The unit times and costs used in
our sample evaluations were crude estimates, and should not be used
for evaluation purposes without first determining their accuracy. As
discussed later in Sec. VII, more research directed toward the
development of elemental times and costs for basic operations performed
in documentation and information retrieval would be helpful.

FIG. 15 COST CALCULATIONS FOR VIDEO TAPE SYSTEM A - GENERAL CASE

NOTE: All three systems started with an initial file size of zero items.

FIG. 16 ANNUAL COST OF THREE STORAGE AND RETRIEVAL SYSTEMS - GENERAL CASE

2.  Cash  Flow  and  Other  Computations  for  Specific  Problems 

This  program  was  written  for  use  in  those  cases  when  more 
detailed information is available to describe the future problem
parameters for a given user. For example, an abstracting or indexing service
that  is  considering  the  establishment  of  a  literature-searching  system 
would  have  a  fairly  accurate  idea  of  what  accession  rate  and  volume  of 
search  requests  it  will  encounter  during  the  next  few  years  of  operation. 
In  this  type  of  situation,  the  user  desires  to  compare  the  costs  of 
candidate  systems  for  his  particular  problem.  That  is,  he  wants  to 
compare  the  expenditures  of  each  candidate  over  some  specified  time,  say 
five  or  ten  years.  This  program  accepts  the  same  basic  information  as 
the  general  cost  analysis  program,  and  prints  a  total  monthly  operating 
cost  for  each  month  in  a  10-year  period,  as  illustrated  in  Figs.  17,  18, 
and  19.  When  plotted,  this  results  in  a  graphic  portrayal  of  the 
monthly  expense  cash  flow  for  a  particular  system  over  a  10-year  period. 
Figure  20  illustrates  the  cash  flow  for  three  candidate  systems. 

Given the cash flows for several alternative systems, we need
some method of choosing the most attractive candidate. In cases where
the curves completely overlap each other, the choice is simple. However,
where the curves intersect at some time in the future, the choice
is not simple, and must consider the time value of money. Two methods
of comparison which are useful in such situations are the "present worth"
and "equivalent annual cost." The present-worth method determines the
present worth of a time sequence of expenses. That is, it determines
how much money would have to be put in the bank today to exactly meet
the prescribed series of payments over the coming 10-year period at a
given interest rate. The candidate system with the lowest present-worth
figure is obviously the most attractive from an economic standpoint.
The second comparison method determines an equivalent annual
cost over a specified number of years. In this way, a series of
unequal monthly costs over a 10-year period could be converted to an
equivalent annual cost. Obviously the system with the lowest annual
cost is the most attractive from an economic standpoint. The program
computes both a present worth and an equivalent annual cost for each
candidate and includes this in the printout shown in Figs. 17, 18, and
19. These values are computed for 1-, 2-, 3-, . . . 10-year periods,
so that systems can be directly compared for any operating period from
1 to 10 years. In the examples shown, the card, computer, and video
tape systems have present worths of $090,434, $1,423,552, and $1,723,070,
respectively, when figured over a 5-year operating period. Over this
interval, the card system would be the most attractive choice from an
economic standpoint. Over a 10-year operating period, the card, computer,
and video tape systems would have present worths of $4,599,227, $2,130,391,
and $3,550,064, respectively. This would indicate that over a 10-year
operating period the computer system would be the most attractive choice.

FIG. 17 COST CALCULATIONS FOR EDGE-NOTCHED CARD SYSTEM A - SPECIFIC PROBLEM
(Printout headings: cost calculations in dollars per year; rate of
return .070; amortization period 5.00.)

FIG. 18 COST CALCULATIONS FOR COMPUTER SYSTEM A - SPECIFIC PROBLEM

FIG. 19 COST CALCULATIONS FOR VIDEO TAPE SYSTEM A - SPECIFIC PROBLEM

FIG. 20 MONTHLY OPERATING COSTS OF THREE STORAGE AND RETRIEVAL SYSTEMS -
SPECIFIC PROBLEM

In the examples shown, the cost analysis programs only
considered a time series of generally unequal debits (expenses). However,
the programs could also accommodate an accompanying time series of
generally unequal credits (income) to arrive at a net present worth or
annual cost.


This  equivalent  annual  cost  should  not  be  confused  with  the  actual 
annual  costs.  The  equivalent  annual  cost  is  obtained  by  extending, 
for  an  N-year  period,  the  present  worth  in  equal  annual  payments, 
considering  some  specified  interest  rate. 
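
The two comparison methods can be sketched as follows. The report does not give the exact discounting conventions used by the program, so this sketch assumes end-of-month payments discounted at the nominal monthly rate (annual rate divided by 12) and an equivalent annual cost obtained with the standard capital recovery factor at the annual rate. The two cost series are invented to show curves that cross; they are not the systems of Figs. 17 through 20.

```python
def present_worth(monthly_costs, annual_rate):
    """Discount a series of end-of-month costs back to today (assumed convention)."""
    i = annual_rate / 12.0
    return sum(c / (1.0 + i) ** t for t, c in enumerate(monthly_costs, start=1))


def equivalent_annual_cost(worth, annual_rate, years):
    """Spread a present worth into equal annual payments over the given years."""
    if annual_rate == 0:
        return worth / years
    growth = (1.0 + annual_rate) ** years
    return worth * annual_rate * growth / (growth - 1.0)


def cheapest(candidates, annual_rate, years):
    """Name of the candidate with the lowest present worth, plus all present worths."""
    months = years * 12
    worths = {name: present_worth(costs[:months], annual_rate)
              for name, costs in candidates.items()}
    return min(worths, key=worths.get), worths


# Hypothetical 10-year cash flows: one starts low and grows, one is flat.
candidates = {
    "manual": [100.0 + 10.0 * m for m in range(120)],
    "machine": [500.0] * 120,
}
best_1_year, _ = cheapest(candidates, 0.07, 1)
best_10_year, worths_10 = cheapest(candidates, 0.07, 10)
annual_10 = {name: equivalent_annual_cost(w, 0.07, 10) for name, w in worths_10.items()}
```

With these invented series the preferred candidate changes between the 1-year and 10-year horizons, which is the situation in which the time value of money has to be considered.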
E. The Utility of the Evaluation Procedures

The coarse screening procedures can be used immediately to give
some indications of how the capabilities of a particular system may fit
in the range of some of the variables that will be encountered by storage
and retrieval systems. Since only a few variables have been studied to
date, the collected information provides only a cursory screening.
However, the procedure's usefulness can be improved by the collection of
more data; there is no fundamental reason why this approach cannot be
extended to cover more parameters.

The performance evaluation procedure that matches performance with
requirement, and includes relative weightings for each requirement,
could be used as an interim tool. However, it has some basic limitations,
and it requires some specific information about the intended user
population before it can be used. The basic objections and limitations of
this procedure are: (1) to a large measure it relies upon opinions stated
by users who are conditioned to their present systems, so that the
procedure never really separates need from habit; (2) there are basic
theoretical problems in deriving a single weighting factor for each
requirement. Even if these limitations are accepted, as they probably
would be for an interim application, some additional data must be collected
before the procedure can be used. Specifically, statements and
measurements of the requirements and their relative weightings must be
obtained for the intended user population. It is possible that continued
development of this procedure would provide some answers to the stated
objections.

The performance evaluation procedure that uses a model to reduce
each requirement to the common denominator of time or cost would seem
to be a potentially useful tool. However, it will require considerably
more development before it can be put to practical use. Basically, the
approach seems to be very sound, and bypasses the objections stated for
the first evaluation procedure.

Both of the cost analysis programs could be applied immediately if
the basic operating data and descriptions were available. The approach
is basically sound, and can be improved even further with additional
effort. However, to ensure fair accuracy in an actual evaluation, basic
data and operating procedures would have to be available in more detail
than currently exists.
VII PROBLEM AREAS AND SUGGESTIONS FOR FURTHER RESEARCH


This section provides some suggestions for further research for
the longer-range development of more basic and exhaustive criteria and
methods for the assessment of alternative systems and procedures. The
research results described in this report came from a relatively
brief study aimed at the development of rough measures of worth for
candidate systems. A need still exists for a longer-range research
effort aimed at improving the methodology for comparison of information
systems. Such research would ultimately result also in a better
understanding of the role of information systems in increasing
scientific productivity.

The following general areas should be considered in any future
research program for evaluation procedures: (1) development of methodology
for determining user requirements; (2) determination of elemental times
and costs of the basic operations performed in storage and retrieval
systems; (3) development and use of modelling for performance
evaluation; (4) development and use of modelling for analysis of operating
costs; (5) pilot tests or evaluations of representative systems; (6)
additional basic studies.

A. Methodology for Determining User Requirements

Additional work should be done to develop and improve methods for
determining user requirements. This has been an extremely difficult
study methodologically; some problems have been attacked successfully
but many others remain. The problem of classifying criteria should
receive further attention. The criteria might be classified in some manner
by the type of person affected (e.g., system manager, operator, or user)
or by the basic conceptual units.

Further work should be done to distinguish between the needs of
the user and habits conditioned by his particular environment. Intuitively,
one would expect that for a given task the user's needs for information
would be the same, regardless of his organizational affiliation and the
facilities available to him. Thus, need should not be confused with
habit. However, it does seem to be true that the way in which the user
expresses his needs is conditioned by the present facilities available
to him. It might be more desirable, although perhaps more difficult,
to present the user with a set of specifications designed to test and
measure the importance of various criteria. The following test is an
oversimplification, but it does indicate the approach:

"Which of the following would be more suitable for you:

a. A system which would provide references within
24 hours with 50 percent irrelevant references.

b. A system which would provide references within
one week with virtually no irrelevant material."

The  difficulty  of  the  method  is  to  keep  the  number  of  situations  presented 
to  the  user  within  bounds  and  still  test  the  required  number  of  criteria. 

Some  attention  should  be  given  to  the  measurement  of  requirements 
that  were  not  considered  during  the  preliminary  study,  as  well  as  to  the 
refinement  of  some  of  the  measurements  that  have  already  been  made. 

Perhaps  this  might  be  coupled  with  a  measurement  of  the  requirements 
of  a  particular  user  population  that  is  considering  the  installation 
of  some  comprehensive  information  services. 

B.  Determination  of  Elemental  Times  and  Costs 

As  mentioned  earlier,  the  accuracy  of  the  models  and  the  cost 
analyses  depends,  in  large  measure,  on  the  accuracy  of  the  basic  time 
and  cost  data.  An  operations-analysis  study  of  several  operating  systems 
to  develop  a  collection  of  realistic  time  and  cost  factors  for  the  basic 
functional elements would be very helpful for the modelling operations.

C.  Modelling  for  Performance  Evaluation 

1.  General

The selection of a document retrieval system (DRS) ultimately
depends on choosing a combination of cost and service that best meets
stated requirements. The budget restraint is important.

Suppose it were possible to determine the cost of a DRS and
measure the service it provided. Then each cost-service combination
could be plotted as a point on a graph of cost versus service.


Notice  that  for  a  given  cost,  one  DRS  gives  the  best  service.  The  point 
that  marks  this  DRS  is  called  the  efficient  point,  and  the  curve  through 
the  efficient  points  is  called  the  efficient  curve. 

An  analogy  may  help  to  clarify  this  idea.  Before  buying  an 
automobile it is convenient to separate the available cars into classes
according  to  cost:  e.g.,  compacts  and  luxury  cars.  The  choice  of  a 
car  within  a  cost  class  would  then  depend  only  on  the  service  provided. 
It  may  be  difficult  to  measure  certain  aspects  of  service  (e.g.,  what 
is  the  value  of  a  quiet  ride?),  but  if  this  could  be  done,  then  there 
would  be  a  car  in  each  cost  class,  the  efficient  car,  which  would  give 
the  best  service. 
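The selection of efficient points can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system names, cost classes, and service scores below are hypothetical, and a single numeric service score per system is assumed to be available, which the text notes may be hard to obtain in practice.

```python
def efficient_points(systems):
    """Return the best-service system in each cost class.

    `systems` is an iterable of (name, cost_class, service) tuples.
    A higher service score is assumed to mean better service.
    """
    best = {}
    for name, cost, service in systems:
        # Keep the system with the highest service seen so far at this cost.
        if cost not in best or service > best[cost][1]:
            best[cost] = (name, service)
    # Sorted by cost, the result traces the efficient curve.
    return {cost: name for cost, (name, service) in sorted(best.items())}


# Hypothetical candidates: (name, annual cost in dollars, service score).
candidates = [
    ("DRS-1", 10000, 0.6),
    ("DRS-2", 10000, 0.8),
    ("DRS-3", 25000, 0.7),
    ("DRS-4", 25000, 0.9),
]
efficient = efficient_points(candidates)
```

Here `efficient` maps each cost class to its efficient system, the counterpart of the "efficient car" in each class of the automobile analogy.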

To  construct  a  DRS  efficient  curve  then,  it  is  necessary  to 
compute  service  and  cost  data  for  each  choice.  This  computation  will 
require  some  experimentation  and  observation  of  the  system  under  actual 
operating conditions. In the case of installed systems, this may not
be difficult to do. But for proposed or prototype systems such study
may be difficult and expensive. It may be sufficient to estimate some
components of cost and service by observing operating systems similar to
the proposed system. Other components, however, will have to be derived
from engineering specifications and educated guesses.

There have been suggestions on how to conduct experiments to
obtain data on DRS service.² The chief difficulty appears to be the
cost of such experimentation.

2.  Service

The purpose of a DRS is to satisfy a user's request for information.
In any DRS, the cost in time of providing this service is composed
of four parts: (1) the time to prepare input requests, (2) the
time to obtain the output documents, (3) the time to read the output
documents, and (4) the time to reformulate and reprocess the request
if the first search is unsuccessful, or the time needed to search elsewhere
if the information is not in the file.

There are two "kinds" of time involved in DRS service. First,
there is the time when the user formulates requests and reads output
documents, but when the DRS is free to operate on other search requests.
Second, there is the time when the DRS operates on the request but when
the user is free to do other things. It seems clear that from the user's
standpoint a minute of the first kind of time is not the same as a minute
of the second kind of time, unless the user has only one job to do and
cannot proceed with that job until he receives the search results. In
this situation, total user time is the elapsed time from the moment the
request is formulated until the information is obtained.

But if there are other things the user can do during the
machine search, then his waiting time is not wasted and total user time
is only the time he spends directly in the search effort. In this
circumstance, there may be no significant difference in service provided
by a DRS which completes a search in a minute and one that takes a week.
However, there are indications that the performance of the individual
drops as much as 25% on these alternative tasks when he is waiting for
information.^^ 

To compute total service time it is necessary to convert DRS
search time to user participation time. This can only be done through
knowledge of the work habits of the population using the DRS.

In summary, total service time, T, is

$$T = T_1 + X T_2$$

where

$T_1$ = user time expended in preparing search requests and
analyzing search output

$T_2$ = DRS search time

$X$ = a factor ($0 < X \le 1$) for converting DRS search time
to user time (represents the degree to which the
user is idle or inefficient while the search is
being conducted).

The conversion problem, i.e., setting the value of X, involves a judgement
or measurement by someone familiar with the particular library
and group of users under analysis.

Total service time over some time span (e.g., one year) is
the product of the average service time per search and the amount of
library use. The latter statistic is usually known. The former can
be estimated through detailed analysis of the functions performed between
the moment a need for information is defined until this need is filled.
Insofar as types of functions can be separated, the first four functions
discussed below are DRS functions; the remaining three are user functions.
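The service-time relation, total time equals user time plus X times DRS search time, and the annual total can be sketched as follows. The numbers are hypothetical and the function names are illustrative only; in particular the value of X is a judgement that must come from someone familiar with the user group.

```python
def service_time_per_search(user_time, drs_search_time, idle_factor):
    """Total service time per search: T = T1 + X * T2.

    user_time        T1, user time spent preparing requests and reading output
    drs_search_time  T2, DRS search time
    idle_factor      X, share of DRS time charged to the user (0 < X <= 1)
    """
    if not 0 < idle_factor <= 1:
        raise ValueError("X must satisfy 0 < X <= 1")
    return user_time + idle_factor * drs_search_time


def annual_service_time(avg_time_per_search, searches_per_year):
    """Total service time over a year: average time per search times usage."""
    return avg_time_per_search * searches_per_year


# Hypothetical figures, in hours.
t = service_time_per_search(user_time=1.5, drs_search_time=24.0, idle_factor=0.25)
total = annual_service_time(t, searches_per_year=400)
```

With these assumed inputs a 24-hour search charged at X = 0.25 adds 6 hours to the 1.5 hours of direct user effort, so the choice of X dominates the result whenever DRS search time is long.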

3.  Communication of the Request

Normally  a  request  for  information  is  first  phrased  in  the 
user's  natural  language.  Therefore,  the  request  must  be  converted  into 
a  form  acceptable  to  the  DRS  and  then  entered  onto  the  standard  input 
medium,  such  as  punched  cards.  The  conversion  can  be  done  mechanically 
or  by  human  beings.  In  either  case,  the  average  time  to  completely 
translate  a  request  into  a  suitable  input  form  can  be  estimated  by  direct 
observation.

4.  File Search


Searching  can  be  done  mechanically  or  by  human  beings.  The 
estimated  search  time  should  include  all  the  time  from  the  moment  the 
coded  user  request  is  available  to  the  time  the  search  is  completed. 

If the DRS batches requests before searching begins, then waiting time
is a part of search time. Similarly, if the output is batched before
it is distributed, then this waiting time also must be included in the
total. File search time can be estimated by observing system performance
on a sample of requests.

5.  Document Retrieval

If the output consists of citations or document numbers, then
it will be necessary to obtain the document itself or an abstract of
it. For some DRS this task is incorporated in the file search; for
others it will require another search and consequently more time. If
the DRS does not produce the document itself, then the time required
by the user to get the document will also have to be considered. Document
retrieval time can be estimated from a sample of searches.

6.  Document Duplication

In many systems, copies must be made of the retrieved documents.
The  average  time  for  duplicating  output  is  easily  computed.  If  the  DRS 
output  is  the  document  itself  and  not  a  copy  (for  example,  if  the  output 
is  a  book  from  a  library  shelf),  then  other  users  who  have  need  for  the 
document  will  have  to  wait  until  it  gets  back  into  circulation.  This 
waiting  time  is  harder  to  estimate. 

7.  Rejection of Nonrelevant Material

It is likely that the output of a DRS search will contain
irrelevant documents (false drops). The false drops must be read to
determine that they are irrelevant, and this takes time; the greater
the number of false drops, therefore, the greater the time wasted
reading them. Reading time depends, to a great extent, on the length
of the document. From this standpoint, an output consisting of titles
and abstracts is preferable to full documents. But it is more likely
that a relevant document will be rejected on the basis of a title or a
brief abstract than on the basis of the full text. The time lost through
this kind of error of omission is discussed below.

Time spent rejecting irrelevant material is directly observable.
The proportion or distribution of false drops can be obtained by an
experiment involving a sample of requests.

8.  Omission of Relevant Material

The  cost  in  time  resulting  from  the  system’s  failure  to  provide 
requested  information,  for  whatever  reason,  is  perhaps  the  most  important 
component  of  total  service  time  and  the  most  difficult  to  estimate. 

It is possible to determine the probability of not finding
relevant material by performing an exhaustive search on a sample of
search failures. Expected search time, T, then is

T = (Probability of retrieving information) · (time to
retrieve the information) +

(Probability of not retrieving information) · (sum of
the times in the steps taken by the user to get the
information).
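As a minimal sketch of this expectation, assuming the retrieval probability and the individual step times have already been estimated from a sample of searches (the values and names used here are hypothetical):

```python
def expected_search_time(p_retrieve, time_to_retrieve, fallback_step_times):
    """Expected search time as a two-outcome weighted average.

    p_retrieve           probability the DRS retrieves the information
    time_to_retrieve     time to retrieve it when the search succeeds
    fallback_step_times  times of the steps the user takes after a failure
                         (e.g., resubmitting the request, searching elsewhere)
    """
    if not 0.0 <= p_retrieve <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    p_fail = 1.0 - p_retrieve
    return p_retrieve * time_to_retrieve + p_fail * sum(fallback_step_times)


# Hypothetical: 75% success in 2 hours; on failure, a 2-hour resubmission
# followed by a 6-hour search at another library.
expected = expected_search_time(0.75, 2.0, [2.0, 6.0])
```

The failure branch carries four times the successful retrieval time in this example, so even a modest failure probability raises the expectation well above the 2-hour success case.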

What  steps  does  a  user  take  when  the  information  he  seeks  is 
not  in  the  output?  If  the  user  has  reason  to  believe  the  information 
is  available  in  the  file,  he  can  rephrase  the  request  and  search  the 
file  again.  If  on  the  other  hand,  he  does  not  think  the  information  is 
in  the  file,  then  he  must  search  elsewhere,  or  proceed  with  his  work 
without  the  knowledge  he  wants. 

The  time  involved  in  resubmitting  a  request  has  been  summarized 
above.  The  time  to  seek  information  elsewhere — i.e.,  other  libraries — 
is  also  observable.  But  if  no  more  searching  is  done,  then  what  is  the 
cost  in  time  to  the  user?  It  is  not  probable  that  this  time  can  be 
measured, but it can be assumed that when a DRS does not satisfy a user's
first request, a time penalty is incurred. One penalty that can be used
is the average time it would take to search the Library of Congress or
some other comprehensive file for the information.

9.  Cost


Total system cost is composed of variable and fixed components.
The annual disbursements connected with a mechanized DRS would include
these variable costs: salaries, power requirements, material costs,
translating costs where documents are preprocessed before they are
entered into the file, document enlarging and duplicating costs, etc.

Some of the costs are initial (one-time) costs: DRS purchase price,
building construction, system installation, duplicating and photographic
equipment, and the initial establishment of the basic collection.

In the case of a conventional library, the annual disbursements
would include librarian salaries, cost of acquisition,
request forms, and ventilation and lighting. One-time costs include
the building, cost of initial acquisitions, shelves, filing cabinets,
hand trucks, etc. Many of these types of costs are considered in a
later discussion on DRS cost analysis procedures. There is a considerable
body of cost analysis experience in the digital computer field that may
be applicable to mechanized document retrieval systems.

Further study and modelling of some of the more basic considerations
of the evaluation procedure, such as the possibility of converting
all of the user requirements and system performance characteristics
into a uniform basis for comparison (e.g., time or cost), would seem to
be an important long-range objective.

D.  Modelling for Analysis of Operating Costs

Efforts could fruitfully be employed in the further development of
the programs and procedures for analyzing the operating costs of candidate
systems. Additional algorithms and programming statements could be
developed to make the analysis procedure more exact and more applicable
to a wider range of systems.

Sample analyses of several representative systems and their possible
variations over wide ranges in operating variables, such as the examples
given in Sec. VI, would provide much useful information for organizations
considering the installation of such systems. Much interesting
information (e.g., the incremental cost of incorporating abstracts instead
of references in a searching system, the incremental cost to reduce
the over-all response time by some specified factor, the most
economical equipment configuration or complement for a given task) can
be obtained by running this model. Similarly, cost analyses of a
specific problem situation (see Sec. VI) where the future operating
variables can be estimated may be of interest to organizations whose
problem is fairly well defined.

E.  Pilot Tests or Pilot Evaluations of Representative Systems

Pilot  evaluations  of  representative  retrieval  systems,  either 
operating  or  hypothetical,  would  serve  the  doubly  useful  purpose  of 
providing  a  check  on  the  evaluation  techniques  as  well  as  providing 
useful information about the particular systems.

F.  Basic Studies

There  is  a  need  for  continuing  basic  research  to  determine  the 
following: 

(1)  How the user's productivity is related to the type
and amount of information services provided (i.e.,
what is the gain in user productivity from increasing
incremental amounts of information?);

(2)  How the search needs are related to the tasks required
of the individual (i.e., what types of information or
searches are required for different types of jobs?).

APPENDIX A


RANK CORRELATION METHODS APPLIED TO QUESTIONNAIRE RESULTS

One of the principal tasks of the project is to develop a ranking
of the importance to the users of performance characteristics of storage
and retrieval systems. We do this by analyzing individual rankings
obtained from a sample of the user population.

We  are  concerned  with  two  problems: 

(1)  Measuring  the  agreement,  or  concordance,  among 
the  individual  rankings,  and 

(2)  Estimating the "true" ranking of the performance
characteristics.

We can answer both questions by using rank correlation methods.

The following example, based on a problem in Chapter 6 of Kendall,⁷
illustrates the procedure for computing the degree of concordance among
the rankings and testing its significance.

Consider  the  three  rankings  of  seven  characteristics: 


The  sum  of  squared  deviations  about  the  mean  is  S  =  220.5. 

Is the computed value of S significant? That is, does S = 220.5,
based on the three rankings of seven objects, indicate that P, Q, and R
agree among themselves?

To test the significance of some sample statistic, such as S, the
observed value of S is compared with the entries in a frequency distribution
of all values the sample statistic may take on. Each of the
possible values in the frequency distribution has a certain probability
of occurrence. If the probability of a random occurrence of the observed
value of the statistic is sufficiently low (say .05), then we
may conclude that the observed value is significant. In the present
context, a significant value of S implies that the rankings P, Q, and
R agree.

To test the significance of S, we consult a table whose entries
are the probabilities of exceeding various values of S. Such a table
is found in Kendall's book. For three rankings of seven objects, the
probability that the observed value exceeds 185.6 is .01. In other
words, if 100 groups of three individuals were to rank seven objects
randomly, the expected number of times that the calculated value of S
exceeds 185.6 is one. Since the observed value of S = 220.5 exceeds
the value for 1 percent, the concordance among P, Q, and R cannot be
explained satisfactorily by chance alone.

We now ask what is the best estimate we can make of the true
ranking of the objects. Our answer is to rank the objects according
to the sums of ranks allotted to the characteristics. For the above
example this gives the ranking: A B C D E F G.
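The computation of S and of the estimated ranking can be sketched as below. The three rankings are hypothetical and contain no ties; they are not the rankings of the example above. The coefficient W = 12S / (m²(n³ − n)) is the standard untied form of Kendall's coefficient of concordance and is included only as a convenient 0-to-1 summary; the significance test itself still requires a table such as Kendall's.

```python
def concordance(rankings):
    """Return (rank_sums, S, W) for m rankings of n objects, no ties assumed.

    S is the sum of squared deviations of the rank sums about their mean,
    and W = 12 S / (m^2 (n^3 - n)) is Kendall's coefficient of concordance.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[j] for r in rankings) for j in range(n)]
    mean = m * (n + 1) / 2          # mean of the rank sums
    s = sum((x - mean) ** 2 for x in rank_sums)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return rank_sums, s, w


def estimated_ranking(labels, rank_sums):
    """Order the objects by increasing sum of ranks."""
    return [label for _, label in sorted(zip(rank_sums, labels))]


# Hypothetical rankings of seven characteristics A..G by three respondents.
labels = list("ABCDEFG")
P = [1, 2, 3, 4, 5, 6, 7]
Q = [2, 1, 3, 5, 4, 7, 6]
R = [1, 3, 2, 4, 6, 5, 7]
rank_sums, S, W = concordance([P, Q, R])
order = estimated_ranking(labels, rank_sums)
```

For these assumed rankings the rank sums are 4, 6, 8, 13, 15, 18, and 20, giving S = 226, which would be compared against the tabulated critical value in the same way as the S of the example.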
APPENDIX B


TECHNIQUE FOR COMPUTING A MEASURE OF AGREEMENT BETWEEN A
REQUIREMENT AND A SYSTEM'S PERFORMANCE FOR THAT REQUIREMENT

The first performance evaluation procedure requires, as an intermediate
step, a computation of the measure of agreement (index) between
a requirement and a system's performance for that requirement. This
Appendix describes the procedure for this computation, as applied to
two specific requirements: (1) a requirement to minimize the time to
get the major group of relevant references, and (2) a requirement to
minimize the amount of irrelevant material produced. Because both
indexes are derived in a similar way, only the derivation for the first
requirement is presented. The method can be extended to other requirements.

Minimum  Time  Requirement 

The  average  service  time  per  search  will  be  used  to  measure  how 
well  a  DRS  satisfies  the  first  requirement.  To  compute  this  statistic 
the  distributions  of  DRS  service  time  and  user  waiting  time  must  be 
compounded.

User  Waiting  Time 

Let $n_t$ be the number of users who will wait as long as time t for
search results, and let N be the total number of users. Table B-1 shows
the proportion, $n_t/N$, of users willing to wait until time t for the
relevant references. The data in Table B-1 were derived from 00 responses
to Question ilc in the questionnaire. Figure B-1 is a graph of the data
shown in Table B-1.

FIG. B-1  PERCENT OF RESPONDENTS WILLING TO WAIT UP TO T DAYS
FOR MOST OF THE RELEVANT REFERENCES

Table B-1

PROPORTION OF USERS WILLING TO WAIT UNTIL TIME t FOR THE
RELEVANT REFERENCES

Max time to get         Interval
relevant references     mid-point     n_t/N
(days)                  (days)        (percent)

< 1                       0.5          96.9
2-3                       2.5          83.4
4-13                      8.5          67.0
14-49                    31.5          25.1
60-183                  121.5           5.3
> 183                      -            0.0

Figure B-1 suggests that the distribution of $n_t/N$ is exponential.
As applied to this problem, the exponential assumption means that, in
the discrete case,

$$n_t - n_{t-1} = k\,(N - n_{t-1})$$

where k, the "decay constant," is the reciprocal of mean user waiting
time. This difference equation says that the number of respondents in
the interval from time t-1 to t is proportional to the number of
respondents not satisfied before time t-1. The continuous analog of this
difference equation is

$$dn_t = k\,(N - n_t)\,dt$$

or

$$\frac{dn_t}{N - n_t} = k\,dt$$

which integrated gives

$$c + \log(N - n_t) = -kt$$

where c is the constant of integration. At t = 0, $n_t = 0$, so that
c = -log N. Therefore

$$\log(N - n_t) = -kt + \log N$$

Solving for $n_t$,

$$n_t = N\,(1 - e^{-kt})$$

Finally

$$\frac{n_t}{N} = 1 - e^{-kt}$$

The quotient $n_t/N$ is the proportion of respondents who want search
results by time t at the latest.

The value of k could be estimated by the least-squares fitting
technique. However, the resulting value would be heavily influenced
by one outlying point: the 2-6 months interval point. If this interval
had been 2-3 months and the change had not affected the responses,
then the exponential assumption would give a very good fit. When the
outlying point is ignored, the slope of the line in Fig. B-1, k, is
approximately k = 0.02.
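One way to estimate such a slope is sketched below. Several assumptions are made that the report does not state: the tabulated percentages are treated as the share of respondents still willing to wait at time t (so that they decay as an exponential in t), the line is forced through 100 percent at t = 0, the 2-6 months point is omitted as the outlier, and 8.5 days is taken as the mid-point of the 4-13 day interval. The fitting procedure and logarithm base actually used for Fig. B-1 are not given in the text.

```python
import math

# Interval mid-points (days) and percent willing to wait, read from
# Table B-1; the 2-6 months point is deliberately left out.
midpoints_days = [0.5, 2.5, 8.5, 31.5]
percent_willing = [96.9, 83.4, 67.0, 25.1]


def decay_constant(times, proportions, log=math.log):
    """Least-squares slope of log(proportion) against time, through the origin.

    Fits log(p) = -k t and returns k. Passing log=math.log10 gives the slope
    as it would be read from base-10 semi-logarithmic paper.
    """
    numerator = sum(t * log(p) for t, p in zip(times, proportions))
    denominator = sum(t * t for t in times)
    return -numerator / denominator


proportions = [p / 100.0 for p in percent_willing]
k_natural = decay_constant(midpoints_days, proportions)                 # per day
k_base10 = decay_constant(midpoints_days, proportions, log=math.log10)  # per day
```

Under these assumptions the natural-logarithm fit gives k of roughly 0.044 per day and the base-10 fit a slope of roughly 0.019 per day. The second figure is close to the quoted 0.02, which may indicate that the slope was read from a base-10 plot, although the report does not say so.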

DRS  Service  Time 

No empirical data are available on DRS service time, although such
data could be developed through a program of experimentation on prototype
systems. In the following analysis the DRS service time distribution
is denoted by g(t).

Average  Service  Time  per  Search 

Figure  B-2  will  help  explain  how  the  average  service  time  per 
search  statistic  is  computed. 

The abscissa represents user waiting time, and the distribution
below the x-axis shows the proportion of users willing to wait up to
the corresponding time on the x-axis. Thus, the dark area below time
dt is the proportion of users willing to wait for search results till
time dt.

FIG. B-2  USER WAITING-TIME DISTRIBUTION

The ordinate of Fig. B-2 is also measured in units of time, in
this case the amount of time required by a DRS to satisfy a search request.
The distribution appended to the ordinate is the probability
that a DRS will satisfy a search request by the given time. The dark
area to the left of time interval dt is the probability the search is
satisfied in that interval.

Consider a single user, one willing to wait up to time dt for
search results. This user will generate many searches, some that can
be serviced quickly, others that will take a long time to satisfy. It
is assumed that the search times required to satisfy his requests are
distributed uniformly over time. The column with base dt represents
the searches generated by the particular user. Of these searches, only
those satisfied by time dt (that is, the cross-hatched area in the
column) are successful.

Let

$n_{t_1}/N$ = proportion of users willing to wait until time $t_1$

$g(t_2)$ = probability a search is completed by time $t_2$.

Then the average time per search, T, is

$$T = \int_0^\infty \int_0^{t_1} \frac{d}{dt_1}\!\left(\frac{n_{t_1}}{N}\right) g(t_2)\, dt_2\, dt_1$$

But by a previous result

$$\frac{n_{t_1}}{N} = 1 - e^{-kt_1}$$

Therefore

$$T = \int_0^\infty \int_0^{t_1} k\,e^{-kt_1}\, g(t_2)\, dt_2\, dt_1$$

or

$$T = \int_0^\infty e^{-kt_2}\, g(t_2)\, dt_2$$

This is as far as the analysis can be carried without knowing the form
of g(t).

If the DRS service-time distribution, g(t), is exponential, then

$$T = \frac{a}{a + k}$$

where 1/a is the mean DRS service time, and 1/k is the mean user waiting
time.

Note that as 1/a becomes large relative to 1/k the quotient approaches
zero. Conversely, when 1/a becomes small relative to 1/k the quotient
approaches 1. Therefore,

$$0 < T < 1$$

An assumption underlying the above analysis is that user waiting
time and DRS service time are independent. This may not be true. It
is possible that users who are willing to wait a long time for search
results are the ones whose search requests normally take a long time to
satisfy. If the independence assumption is false, then T will not be
an accurate measure, even though it may not be biased toward any particular
DRS.
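The exponential case can be checked numerically. The sketch below assumes g(t) is the exponential density a·exp(-at) with mean 1/a, for which the integral of exp(-kt)·g(t) from zero to infinity evaluates to a/(a + k); this form matches the limiting behavior described above. The rates used are hypothetical.

```python
import math


def index_closed_form(a, k):
    """Closed form a / (a + k) for an exponential service-time density."""
    return a / (a + k)


def index_numeric(a, k, steps=100_000):
    """Trapezoidal estimate of the integral of exp(-k t) * a * exp(-a t).

    The upper limit is cut off at 40 / (a + k), beyond which the integrand
    is below exp(-40) of its starting value and contributes negligibly.
    """
    upper = 40.0 / (a + k)
    h = upper / steps

    def integrand(t):
        return math.exp(-k * t) * a * math.exp(-a * t)

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


# Hypothetical rates: mean DRS service time 1/a = 2 days,
# mean user waiting time 1/k = 50 days.
a, k = 0.5, 0.02
closed = index_closed_form(a, k)
numeric = index_numeric(a, k)
```

With service much faster than the users' mean tolerance the index is close to 1; reversing the two rates drives it toward zero, as the text states.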

Minimum  Irrelevant  Material  Requirement 

As  stated  earlier,  the  derivation  of  the  minimum  irrelevant  material 
index — called  the  average  percentage  of  false  drops  per  search,  and 
signified  by  D — is  not  presented.  The  steps  followed  in  deriving  T 
can  be  repeated  to  derive  D.  The  appropriate  distributions  in  this 
case are the percent of users willing to accept up to d false drops,
and  the  probability  that  a  DRS  will  produce  d  false  drops. 

The result is

$$D = \frac{1}{1 + i_c / i_o}$$

where $i_o$ is the mean number of false drops acceptable to users and $i_c$ is
the mean number of false drops produced by the DRS. Again 0 < D < 1.
APPENDIX C


GENERAL COST ANALYSIS PROGRAM WITH SPECIFIC DATA INSERTED
FOR THE ANALYSIS OF COMPUTER SYSTEM A


BURROUGHS  ALGEBRAIC  COMPILER  -  STANDARD  VERSION  7/25/611 
INTEGER  I  iJ.K  ,M.N*Tt 
INITIALCONDITIONS.. 

N*lOJ  ARRAY  I NPUT RATE ( 10 )= (0. 50, 100 »5000i 10000*15000 >2 0000 *30000, 
35000* AOOOO ) S 

M  =  9  3;  ARRAY  SE ARCHLO AD  (  9  )  =  (  0  *  1 0  *  50  *  1  00  *  250  »  5  00  *  750  *  1000 , 2000  )  S 
FIlESI/E=05 

ROFR=0.07  $  COMMENT  IHIS  IS  RATE  OF  RETURNS 
YEARS=5,0  S  COMMENT  EOUIPMENT  AMORTIZATION  PERIODS 
BURDEN=n. 1 5S 
0VERHEAD=n.?5$ 

CONSTANTS. . 

ARRAY  ANSWFR(in,OllOO,A)*C(100),R(100)$ 

M!NS=1020O,ns 
H0UR5=1 70.ns 

FORMAT  HEADERl  (PA0,»COST  CALCULATIONS  (IN  DOLLARS  PER  YEAR)**W3)* 
HEADER2  ( 

•COMPUTER  SYSTEM  A  * 


,WA  )  ♦ 

HEADERS  (*IN1TIAL  FILE  S I ZE*  .  S 1 0 . 0  *  WO , 
•  RATE  or  r!ETURN.*jS5j3*wO* 


•  AMORTIZATION  PERIOD*. S5.3*W0* 

•BURDEN  RATF**S5.3,WO* 

•OVERHEAD  RATE* ,S5 . 3  ,W0 ) * 

HEA0ER4  (*INPUT  RATE* *P20  .*5EARCH  VOLUME  (NUMBER  OF  SEARCH* 


*ES  PER  M0NTH)**W4) * 

HEADERS  (*( ITEMS/MO. I**W2) . 

HEADERS  ( BIO, 11X10. 0*W2 ) » 

COSTFORMAT  ( X8,0* B2 ♦ 1 1 XIO.O *W0 ) S 
OUTPUT  HEADERL1NE3  ( F I LESI ZE ,ROFR , YE AR S *  BURDEN . OVERHEAD )  , 
HFAOFRI  1NE6  IFOR  J=(1,1,M)  SSE ARCHLOAD  (  J )  I , 
COSTLINt  (  INPUTRATEI  1  )  .FOR  J=  (  1  ,1  ,M)  SANSWERUDS 


PARAMETERS.. 

R( 1 1=1000$ 
R(21=720S 
R ( 3 1 =5005 
R ( 4 ) =350$ 
R( 5)=350I 


D( 1 ,3 )=0$ 

D(?,3)=20S 

0(3,1)=?1,5S  D(3,?)=20$ 

D(4,l)=l$  D(4,2)=5S 

0(5,1) =12$  D(5,2)=16$ 


R(6)=75nt 


D(6.3)=3$ 


R ( 7 ) =250$ 


D  (  7 , 3  )  =  1  S 


R(?5)=600n$  D(25,l)=6$ 


D( 25,2  )=8$ 


K( 26  I  =  1 OOOOOnsDl 26 . 1 ) =0 . 0O75$D ( 26 , 2 ) =0 ,0 5$  D( 26 , 3 ) = 0. 0 5/ 20000 . 0$ 


R ( 61 ) =0,001 4$  0(61.11=6.1$  0(61,2)=B.]$ 

R(62)-75$ 

R(  81  )  =2000$  D(81»n=0. 00  75$D181, 21=0.05$ 

D( 81 ,41=1000$ 


START., 


D(62, 31  =  1, 0/200000. OS 
D(81 .3  1=0.05/20000,0$ 


WRITE  (ISHFADFRllS 
WRITE  (t$HFAnER2)$ 

WRITE  (  SSHEADFRL 1NE3, HEADERS) $ 
WRITE  ($$HEADER4)S 
WRITE  (SSHEADERSIS 
WRITE  ($$HEADERLINE6,HEADFR6)$ 
IF  M  GTR  11  $  M=ll$ 
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MI NT=I<0FR/1 2$C0MMFNT  THIS  IS  MONTHLY  EQUIV.  NOMINAL  INTERESTl 
COMMENT  COMPUTE  CHARGESS 

CHARGES'*  1  +BURDEN+OVFRHEAr)+BUR  DEN,  OVERHEADS 
AMORT= (MI  NT { 1+MINT ) » (YEARS. 12 ) ) / ( ( 1+MINT  )» ( YEAR5.12  )-n  S 
ITEM..  FOR  1=(1.1,N1S  BEGIN 

items=inputrate(  n$ 

FOR  K=(l,l,n)$  AN5WER(K)=0$ 

SEARCH. .FOR  J=(1,1,M)  $  BEGIN 
SIZ£=FILFSI2E$ 

S£ARCHES=5EARCHL0A0( J ) S 
TIME..  FOR  T=(n,-1,0)$  BEGIN 

FOR  K-'(i,i,sn)$  r(ic)=n$ 

TOT1 =TCT?=TOTS=TOT4=OS 
COMMENT  COMPUTE  LABOR  TYPE  A  COSTS 

IF  R( 11  nfq  os  begin 

EITHER  IF  ENTIRE  t ( D I  3, 1 ) . I TEMS  +  D I  3 . 2 ) . SEARCHES ) /MI N5  +  1 ) 
+ENT1RE  ((0(A,1) . ITEMS+D( 4,2 ) .SEARCHES) /MINS+1 ) 

+FNT1RE  ((D(5,l). I TEMS+D( 6 , 2 ). SEARCHES ) /MIN5+1 ) 

GTR  D(l,3)$  C ( 1  )-C( 1 )+R( 11 $0THERWISESC( 1 ) =0S 
ENDS 

COMMENl  COMPUTE  LABOR  TYPE  B  COSTS 

tP  (RI?l  NFQ  0)  AND  (D(2»3>  NEO  0)S  BEGIN 
C(?)=C(2)+R(2) (ENTIRE  (( 

ENTIRE  ((0(3,1)  .  !TEM54-D13,2  )  .  SEARCHES )  /M I  NS  +  1 ) 

+eNTIRE  ( (D(4,l) .ITEMS+D(4,2 ) .SEARCHES) /Ml NS+1) 

+ENT1RE  ( (D(3.1 ) . lTEHb+0( 5,2 ) .SEARCHES) /MINS+1 ) 
1/D(2,3) ) 

1 

ENDS 

COMMENT  COMPUTE  l.ABOR  TYPE  C  COSTS 
IF  R(3)  NEQ  0  S 

C(3)=R( 31 (ENTIRE  (DI3,1 ). ITEMS/ MI  NS 
+0(3,2) .SEARCHES/ Ml NS+1 ) ) S 
COMMENT  COMPUTE  LABOR  TYPE  D  COSTS 
IF  R(4)  NEO  0  S 

C( 41 =R (4 )  (ENT  IRE  ( 0(4 ,1 ) . I TEMS/MINS 
+D (4 ,2 ) ,SEARCHES/M1NS+1 
+0(4,3) .SI2E/H0URS 
+0(4,1) .SEARCHFS.SIZF. 0(63, 3) /MINS 
)  )S 

COMMENT  COMPUTE  LABOR  TYPE  E  COSTS', 

IF  R(5)  NE(^  0  S 

C(51=R(5) (ENTIRE  (0(5,1 (.ITEMS/ Ml  NS 
+0(5,2) .SEARCHF S/MI NS+1 ) )S 
COMMENT  COMPUTE  LABOR  TYPE  F  COSTS 
IF  R(6)  NFQ  0  $ 

C(6)=R(6) .D(6,3)$ 

COMMFNT  COMPUTF  1  AROR  TYPE  G  COSTS 
IF  R( 7 )  NEQ  0  S 

C ( 7) =P ( 7 ) .D( 7,3  )S 

COMMFNT  COMPUTF  EQUIPMFNT  TYPF  A  COSTS 
IF  R( 21 )  NFQ  0  S 

C(21)=R(?1){ENT1RE  (D(21,3).S1ZF+11 )$ 

COMMENT COMPUTE EQUIPMENT TYPE B COST$
IF R(22) NEQ 0 $
FOR K=(1,1,20)$
IF C(K) NEQ 0 $ C(22)=C(22)+R(22).D(22,3).(C(K)/R(K))$

COMMENT COMPUTE EQUIPMENT TYPE C COST$
IF R(23) NEQ 0 $
IF C(4) NEQ 0 $ C(23)=R(23).D(23,3).(ENTIRE (D(4,1).
ITEMS/MINS+D(4,2).SEARCHES/MINS+1))$

COMMENT COMPUTE EQUIPMENT TYPE D COST$
IF (R(25) NEQ 0) AND (D(25,4) EQL 0)$
C(25)=R(25)(ENTIRE(D(25,1).ITEMS/(2.MINS)
+D(25,2).SEARCHES/(2.MINS)+1))$
IF (R(25) NEQ 0) AND (D(25,4) NEQ 0)$BEGIN
TEMP=ENTIRE(D(25,1).ITEMS/MINS+D(25,2).SEARCHES/MINS+1)$
C(25)=R(25)(ENTIRE ((TEMP+1)/2))+D(25,4)(ENTIRE (TEMP/2))$
END$

COMMENT COMPUTE EQUIPMENT TYPE E COST$
IF (R(26) NEQ 0) AND (D(26,4) EQL 0)$
C(26)=R(26)(ENTIRE (D(26,1).ITEMS/(2.MINS)
+D(26,2).SEARCHES/(2.MINS)
+D(26,3).SIZE.SEARCHES/(2.MINS)+1))$
IF (R(26) NEQ 0) AND (D(26,4) NEQ 0)$BEGIN
TEMP=ENTIRE (D(26,1).ITEMS/MINS+D(26,2).SEARCHES/MINS
+D(26,3).SIZE.SEARCHES/MINS+1)$
C(26)=R(26)(ENTIRE ((TEMP+1)/2))+D(26,4)(ENTIRE (TEMP/2))$
END$

COMMENT COMPUTE MATERIAL TYPE A COST$
IF R(61) NEQ 0$
C(61)=R(61)(D(61,1).ITEMS+D(61,2).SEARCHES)$

COMMENT COMPUTE MATERIAL TYPE B COST$
IF R(62) NEQ 0$
C(62)=R(62)(ENTIRE(D(62,3).SIZE+10))$

COMMENT COMPUTE MATERIAL TYPE C COST$
IF R(63) NEQ 0$
C(63)=R(63)(SEARCHES.SIZE.D(63,2).D(63,3))$

COMMENT COMPUTE MISC. TYPE A COST$
IF (R(81) NEQ 0) AND (D(81,4) EQL 0)$
C(81)=R(81)(ENTIRE (D(81,1).ITEMS/(2.MINS)
+D(81,2).SEARCHES/(2.MINS)
+D(81,3).SIZE.SEARCHES/(2.MINS)+1))$
IF (R(81) NEQ 0) AND (D(81,4) NEQ 0)$BEGIN
TEMP=ENTIRE (D(81,1).ITEMS/MINS+D(81,2).SEARCHES/MINS
+D(81,3).SIZE.SEARCHES/MINS+1)$
C(81)=R(81)(ENTIRE ((TEMP+1)/2))+D(81,4)(ENTIRE (TEMP/2))$
END$

COMMENT COMPUTE TOTAL LABOR COST$
FOR K=(1,1,20)$
TOT1=TOT1+C(K)$
TOT1=TOT1.CHARGES$

COMMENT COMPUTE TOTAL EQUIPMENT COST$
IF AMORT NEQ 0$ BEGIN
FOR K=(21,1,60)$
TOT2=TOT2+C(K)$
TOT2=TOT2.AMORT$

END$

COMMENT COMPUTE TOTAL COST$
FOR K=(61,1,100)$
TOT3=TOT3+C(K)$
ANSWER(J)=ANSWER(J)+(TOT1+TOT2+TOT3)(1+MINT)*T$
SIZE=SIZE+ITEMS$
END$
END$
WRITE ($$COSTLINE,COSTFORMAT)$
END$
FINISH$

COMPILED PROGRAM ENDS AT 1183
PROGRAM VARIABLES BEGIN AT 3725
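
The labor-cost statements in the listing above share one pattern: a monthly workload in minutes is divided by MINS, one is added, and ENTIRE takes the integer part, giving a whole number of workers that is multiplied by the rate R. The following is a minimal Python sketch of that pattern, not a translation of the original program. It assumes that ENTIRE behaves like a floor for positive arguments and that MINS is 10200.0 minutes per month (the value set in the Appendix D listing); the function names and the sample inputs are illustrative only.

```python
import math

# Working minutes per month; 10200.0 is the constant used in the Appendix D
# listing (170 hours x 60 minutes). Treat it as an adjustable assumption.
MINS = 10200.0


def staff_needed(items, searches, min_per_item, min_per_search, mins=MINS):
    """Whole number of workers for a monthly load, mirroring ENTIRE(load/MINS + 1).

    ENTIRE is assumed to be the integer part (floor) of a positive number.
    """
    load_minutes = min_per_item * items + min_per_search * searches
    return math.floor(load_minutes / mins + 1)


def labor_cost(rate, items, searches, min_per_item, min_per_search):
    """Monthly cost of one labor type: rate per worker times workers needed.

    A zero rate means the labor type is unused, as in the listing's
    IF R(n) NEQ 0 tests.
    """
    if rate == 0:
        return 0
    return rate * staff_needed(items, searches, min_per_item, min_per_search)


if __name__ == "__main__":
    # Illustrative inputs: 650 items and 20 searches per month,
    # 21.5 minutes per item and 35.25 minutes per search.
    print(staff_needed(650, 20, 21.5, 35.25))
    print(labor_cost(500, 650, 20, 21.5, 35.25))
```

Because of the added one and the integer part, cost rises in steps: it stays flat until the workload crosses the next multiple of MINS, then jumps by one worker's rate.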


APPENDIX D


COST ANALYSIS PROGRAM FOR CASH-FLOW AND PRESENT-WORTH COMPUTATIONS WITH
DATA INSERTED FOR THE ANALYSIS OF EDGE-NOTCHED CARD SYSTEM A


BURROUGHS ALGEBRAIC COMPILER - STANDARD VERSION 7/25/61$

INTEGER I,J,K,M,N,T$

INITIALCONDITIONS..

N=10$ ARRAY INPUTRATE(10)=(650,850,1000,1150,1250,1325,1325,
1325,1325,1325)$

ARRAY SEARCHLOAD(10)=(20,150,300,850,1250,1325,1500,
1575,1700,1825)$

FILESIZE=0$

ROFR=0.07 $ COMMENT THIS IS RATE OF RETURN$
YEARS=5.0 $ COMMENT EQUIPMENT AMORTIZATION PERIOD$
BURDEN=0.15$

OVERHEAD=0.25$

CONSTANTS..

ARRAY ANSWER(11),D(100,4),C(100),R(100)$

MINS=10200.0$

HOURS=170.0$

FORMAT HEADER1 (B40,*COST CALCULATIONS (IN DOLLARS PER YEAR)*,W3),
HEADER2 (
*EDGE-NOTCHED CARD SYSTEM*
,W4),
HEADER3 (*INITIAL FILE SIZE*,S10.0,W0,
*RATE OF RETURN*,S5.3,W0,
*AMORTIZATION PERIOD*,S5.3,W0,
*BURDEN RATE*,S5.3,W0,
*OVERHEAD RATE*,S5.3,W0),
HEADER4 (*YEAR INPUT RATE SEARCH LOAD*,W4),
HEADER41 (B6,*(ITEMS/MO.) (SEARCHES/MO.)*,W2),
HEADER5 (N(B1,I2,B3,X10.0,B3,X10.0,W0)),
HEADER6 (*YEAR MONTH COST*,W2),
HEADER7 (*YEAR EQUIV. ANNUAL COST PRESENT WORTH*,W1),
TOTALFORMAT (B1,I2.0,B4,I2.0,B2,X10.0,C0),
COSTFORMAT (B1,I2.0,B6,X10.0,B6,X10.0,W0)$

OUTPUT HEADERLINE3 (FILESIZE,ROFR,YEARS,BURDEN,OVERHEAD),
HEADERLINE5(FOR I=(1,1,N)$(I,INPUTRATE(I),SEARCHLOAD(I))),
COSTLINE (I,AC,PW),
TOTALLINE (I,12-T,TOTAL)$

PARAMETERS..

R(1)=1000$ D(1,3)=5$
R(2)=720$ D(2,3)=20$
R(3)=500$ D(3,1)=21.5$ D(3,2)=35.25$
R(4)=350$ D(4,1)=7.25$ D(4,2)=16.5$ D(4,3)=1.1/30000$
R(21)=100$ D(21,3)=1.0/60000.0$
R(22)=260$ D(22,3)=1$
R(23)=120$ D(23,3)=1$
R(25)=1500$ D(25,1)=1.0/20000.0$
R(61)=0.02$ D(61,1)=1.01$
R(63)=0.02$ D(63,3)=1.01/500.0$

START..

WRITE ($$HEADER1)$
WRITE ($$HEADER2)$
WRITE ($$HEADERLINE3,HEADER3)$
WRITE ($$HEADER4)$
WRITE ($$HEADER41)$
WRITE ($$HEADERLINE5,HEADER5)$
WRITE ($$HEADER6)$

IF M GTR 11 $ M=11$

MINT=ROFR/12$ COMMENT THIS IS MONTHLY EQUIV. NOMINAL INTEREST$
COMMENT COMPUTE CHARGES$
CHARGES=1+BURDEN+OVERHEAD+BURDEN.OVERHEAD$
AMORT=(MINT(1+MINT)*(YEARS.12))/((1+MINT)*(YEARS.12)-1)$

ITEM.. FOR I=(1,1,N)$ BEGIN
ITEMS=INPUTRATE(I)$
SEARCHES=SEARCHLOAD(I)$

TIME.. FOR T=(11,-1,0)$ BEGIN
FOR K=(1,1,100)$ C(K)=0$
TOT1=TOT2=TOT3=TOT4=0$

COMMENT COMPUTE LABOR TYPE A COST$
IF R(1) NEQ 0$ BEGIN
EITHER IF ENTIRE ((D(3,1).ITEMS+D(3,2).SEARCHES)/MINS+1)
+ENTIRE ((D(4,1).ITEMS+D(4,2).SEARCHES)/MINS+1)
+ENTIRE ((D(5,1).ITEMS+D(5,2).SEARCHES)/MINS+1)
GTR D(1,3)$ C(1)=C(1)+R(1)$OTHERWISE$C(1)=0$
END$

COMMENT COMPUTE LABOR TYPE B COST$
IF (R(2) NEQ 0) AND (D(2,3) NEQ 0)$ BEGIN
C(2)=C(2)+R(2)(ENTIRE ((
ENTIRE ((D(3,1).ITEMS+D(3,2).SEARCHES)/MINS+1)
+ENTIRE ((D(4,1).ITEMS+D(4,2).SEARCHES)/MINS+1)
+ENTIRE ((D(5,1).ITEMS+D(5,2).SEARCHES)/MINS+1)
)/D(2,3))
)
END$

COMMENT COMPUTE LABOR TYPE C COST$
IF R(3) NEQ 0 $
C(3)=R(3)(ENTIRE (D(3,1).ITEMS/MINS
+D(3,2).SEARCHES/MINS+1))$

COMMENT COMPUTE LABOR TYPE D COST$
IF R(4) NEQ 0 $
C(4)=R(4)(ENTIRE (D(4,1).ITEMS/MINS
+D(4,2).SEARCHES/MINS+1
+D(4,3).SIZE/HOURS
+D(4,1).SEARCHES.SIZE.D(63,3)/MINS
))$

COMMENT COMPUTE LABOR TYPE E COSTS$
IF R(5) NEQ 0 $
C(5)=R(5)(ENTIRE (D(5,1).ITEMS/MINS
+D(5,2).SEARCHES/MINS+1))$

COMMENT COMPUTE LABOR TYPE F COST$
IF R(6) NEQ 0 $
C(6)=R(6).D(6,3)$

COMMENT COMPUTE LABOR TYPE G COST$
IF R(7) NEQ 0 $
C(7)=R(7).D(7,3)$

COMMENT COMPUTE EQUIPMENT TYPE A COST$
IF R(21) NEQ 0 $
C(21)=R(21)(ENTIRE (D(21,3).SIZE+1))$

COMMENT COMPUTE EQUIPMENT TYPE B COST$
IF R(22) NEQ 0 $
FOR K=(1,1,20)$
IF C(K) NEQ 0 $ C(22)=C(22)+R(22).D(22,3).(C(K)/R(K))$

COMMENT COMPUTE EQUIPMENT TYPE C COST$
IF R(23) NEQ 0 $
IF C(4) NEQ 0 $ C(23)=R(23).D(23,3).(ENTIRE (D(4,1).
ITEMS/MINS+D(4,2).SEARCHES/MINS+1))$

COMMENT COMPUTE EQUIPMENT TYPE D COST$
IF (R(25) NEQ 0) AND (D(25,4) EQL 0)$
C(25)=R(25)(ENTIRE(D(25,1).ITEMS/(2.MINS)
+D(25,2).SEARCHES/(2.MINS)+1))$
IF (R(25) NEQ 0) AND (D(25,4) NEQ 0)$BEGIN
TEMP=ENTIRE(D(25,1).ITEMS/MINS+D(25,2).SEARCHES/MINS+1)$
C(25)=R(25)(ENTIRE ((TEMP+1)/2))+D(25,4)(ENTIRE (TEMP/2))$
END$

COMMENT COMPUTE EQUIPMENT TYPE E COST$
IF (R(26) NEQ 0) AND (D(26,4) EQL 0)$
C(26)=R(26)(ENTIRE (D(26,1).ITEMS/(2.MINS)
+D(26,2).SEARCHES/(2.MINS)
+D(26,3).SIZE.SEARCHES/(2.MINS)+1))$
IF (R(26) NEQ 0) AND (D(26,4) NEQ 0)$BEGIN
TEMP=ENTIRE (D(26,1).ITEMS/MINS+D(26,2).SEARCHES/MINS
+D(26,3).SIZE.SEARCHES/MINS+1)$
C(26)=R(26)(ENTIRE ((TEMP+1)/2))+D(26,4)(ENTIRE (TEMP/2))$
END$

COMMENT COMPUTE MATERIAL TYPE A COST$
IF R(61) NEQ 0$
C(61)=R(61)(D(61,1).ITEMS+D(61,2).SEARCHES)$

COMMENT COMPUTE MATERIAL TYPE B COST$
IF R(62) NEQ 0$
C(62)=R(62)(ENTIRE(D(62,3).SIZE+10))$

COMMENT COMPUTE MATERIAL TYPE C COST$
IF R(63) NEQ 0$
C(63)=R(63)(SEARCHES.SIZE.D(63,2).D(63,3))$

COMMENT COMPUTE MISC. TYPE A COST$
IF (R(81) NEQ 0) AND (D(81,4) EQL 0)$
C(81)=R(81)(ENTIRE (D(81,1).ITEMS/(2.MINS)
+D(81,2).SEARCHES/(2.MINS)
+D(81,3).SIZE.SEARCHES/(2.MINS)+1))$
IF (R(81) NEQ 0) AND (D(81,4) NEQ 0)$BEGIN
TEMP=ENTIRE (D(81,1).ITEMS/MINS+D(81,2).SEARCHES/MINS
+D(81,3).SIZE.SEARCHES/MINS+1)$
C(81)=R(81)(ENTIRE ((TEMP+1)/2))+D(81,4)(ENTIRE (TEMP/2))$
END$

COMMENT COMPUTE TOTAL LABOR COST$
FOR K=(1,1,20)$
TOT1=TOT1+C(K)$
TOT1=TOT1.CHARGES$

COMMENT COMPUTE TOTAL EQUIPMENT COST$
IF AMORT NEQ 0$ BEGIN
FOR K=(21,1,60)$
TOT2=TOT2+C(K)$
TOT2=TOT2.AMORT$

END$

COMMENT COMPUTE TOTAL COST$
FOR K=(61,1,100)$
TOT3=TOT3+C(K)$
TOTAL=TOT1+TOT2+TOT3$
WRITE ($$TOTALLINE,TOTALFORMAT)$
ANSWER(I)=ANSWER(I)+(TOT1+TOT2+TOT3)(1+MINT)*T$
SIZE=SIZE+ITEMS$
END$
END$

WRITE ($$HEADER7)$
FOR I=(1,1,N)$
BEGIN
PW=0$
FOR K=(I,-1,1)$
PW=PW+ANSWER(K)$
PW=PW.(1/(1+ROFR)*I)$
AC=PW.((ROFR(1+ROFR)*I)/((1+ROFR)*I-1))$
WRITE ($$COSTLINE,COSTFORMAT)$
END$

FINISH$

COMPILED PROGRAM ENDS AT 1?51
PROGRAM VARIABLES BEGIN AT 3726
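
The AMORT and AC statements in the listing above both have the form i(1+i)^n / ((1+i)^n - 1), the standard capital recovery factor, and CHARGES expands to (1 + BURDEN)(1 + OVERHEAD). The Python sketch below restates those two formulas so they can be checked by hand. It is an illustration under stated assumptions, not the original program: the function names are invented here, and the zero-rate branch (returning 1/n, the limiting value) is an added convenience that the listing does not contain.

```python
def capital_recovery_factor(rate, periods):
    """Return i(1+i)^n / ((1+i)^n - 1) for periodic rate i and n periods.

    The zero-rate case returns 1/n (the limit as i approaches 0); this
    branch is an assumption added for robustness, not part of the listing.
    """
    if periods <= 0:
        raise ValueError("periods must be positive")
    if rate == 0:
        return 1.0 / periods
    growth = (1.0 + rate) ** periods
    return rate * growth / (growth - 1.0)


def charges_multiplier(burden, overhead):
    """Labor multiplier 1 + burden + overhead + burden*overhead."""
    return 1.0 + burden + overhead + burden * overhead


def equivalent_annual_cost(present_worth, rate, years):
    """Spread a present worth over `years` at `rate`, as the AC statement does."""
    return present_worth * capital_recovery_factor(rate, years)


if __name__ == "__main__":
    # Values taken from the listing's initial conditions.
    ROFR, YEARS, BURDEN, OVERHEAD = 0.07, 5.0, 0.15, 0.25
    mint = ROFR / 12                      # monthly nominal interest (MINT)
    amort = capital_recovery_factor(mint, YEARS * 12)
    print("monthly amortization factor:", round(amort, 6))
    print("labor charges multiplier:", charges_multiplier(BURDEN, OVERHEAD))
```

With the listing's values (7 percent rate of return, 5-year amortization), the monthly factor comes to roughly 0.0198 of the equipment cost per month.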
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APPENDIX  E 


ACCUMULATED FILE SIZES AND CURRENT ACCESSION RATES OF THE
PUBLICATIONS OF SEVERAL ABSTRACTING AND INDEXING SERVICES

                                                  Total Number of Citations or
                                                  Abstracts Published in this     Current Annual
Indexing or                                       Service from its Beginning      Publication
Abstracting Service                               through 1960                    Rate

Abstract Bulletin of the Inst. of Paper
  Chemistry                                            78,500                       0,500
Acoustical Society of America J.:
  References Section                                   20,000                       2,650
Analytical Abstracts (British)                         29,796                       5,359
Applied Mechanics Reviews                              53,267                       7,200
ASTIA (Armed Services Technical
  Information Agency)                                 250,000                      35,000
Battelle Technical Review                             145,295                      12,000
Bibliography of Agriculture                         1,512,737                      97,200
Biological Abstracts                                  992,032                     100,000
Chemical Abstracts                                  2,541,023                     145,200
Cobalt: Review of Technical Literature
  Section                                               1,673                         500
Current Abstracts from Gen. Foods Corp.
  Research Center                                      35,000                       3,000
Dissertation Abstracts                                 40,333                       7,500
Electronic Technology (reprinted in Proc.
  IRE, Abstracts & Ref. Section)                       57,208                       4,500

                                                       72,331                       5,200
                                                        6,000                       3,600

Engineering Index
Forestry Abstracts (British)
Geoscience Abstracts

Index Medicus                                       1,075,039                     140,000
IRE-PGEC Computer Abstracts                             1,140                       3,600
Mathematical Reviews                                  127,000                      13,500
Meteorological and Astrogeophysical
  Abstracts                                            59,700                      10,000
Nuclear Science Abstracts                             115,000                      31,000
Prevention of Deterioration Abstracts                  19,350                       1,700
Psychological Abstracts                               212,055                       0,500
Review of Metal Literature                            145,082                      14,000
Science Abstracts (British)                           441,719                      30,000
Semiconductor Products                                  6,000                       1,500
Solar Energy                                              500                         100
Technical Translations                                 21,917                      12,000
Tobacco Abstracts                                       0,527                       2,300
U.S. Government Research Reports                       53,292                      24,000


INTERVIEW GUIDE USED IN THIS STUDY AND SUMMARY OF RESPONSES


INTERVIEW  GUIDE 

We  are  conducting  a  study,  under  NSF  sponsorship,  to  develop  methods 
for  evaluating  the  performance  of  document  retrieval  systems.  To  do 
this,  we  have  to  know  the  needs  of  users  of  documents.  So  we  are 
talking  to  some  researchers  in  electronics  in  various  companies  about 
their  own  document  needs. 

Let  me  give  you  definitions  for  two  terms  I'll  be  using  throughout  this 
interview. (HAND RESPONDENT CARD A AND LET HIM READ WITH YOU.)

First, I am concerned with document retrieval - that is, the
retrieval of entire documents, abstracts, or citations of
documents. I am not concerned with information retrieval -
that is, general information in response to a request, nor
with data retrieval - that is, the retrieval of specific
facts.

Second, is the term search. This is when you, or someone
else at your request, looks for references and/or documents
on a given subject. A search can be extensive and made
through one or more libraries, or it can be very brief -
such as looking through sources you keep in your own office.

Not included are requests for specific documents (whose
complete citation is known) that you know deal with the
subject. For example, you are not searching when you ask
the library to send you a specific issue of the IRE Proceedings.

(TAKE BACK CARD A)

1.  Keeping  this  definition  in  mind,  have  you,  or  anyone  requested  by 
you,  conducted  any  searches  in  the  last  year? 

_  Yes  _  No  (IF  NO,  TERMINATE  INTERVIEW) 

(IF  YES,  ASK:) 

2.  Roughly,  how  many?  _ 


3a. Here is a list of some activities EE's work in (HAND RESPONDENT
CARD B). In what one activity do you spend the most working time?

3b. Which activities account for the majority of your searches? (IF
RESPONDENT  GIVES  MORE  THAN  THREE,  ASK  FOR  THREE  THAT  ACCOUNT  FOR 
THE  MOST  SEARCHES.) 

3c.  Now  I'd  like  to  ask  you  about  the  most  recent  search  you  did  or 
had  someone  else  do  while  engaged  in  one  of  the  activities  you 
named.  Which  of  the  activities  you  named  required  this  search? 


                                     Q. 3a           Q. 3b               Q. 3c

                                     One Activity    Three Activities    One Activity
                                     Most Working    Majority of         Most Recent
                                     Time            Searches            Search

a. General project planning
b. Theoretical design of experiments
c. Design of equipment, systems, and procedures
d. Conduct of lab experiments or field tests
e. Correlation of experimental results with theory, or vice versa
f. Review & evaluation of a specific project or product (a critique)
g. Technical report writing
h. Technical proposal writing
i. Preparation of lectures or technical papers
j. Keeping current with technical advances
k. Search for novel technical ideas on which to base new projects or new research
l. Serving as a consultant
Other

(TAKE BACK CARD B)

4.  Do  you  recall  some  of  the  details  of  this  search? 

_ Yes _ No (IF NO, SKIP TO Q. 20)

5a. Do you recall anything happening during the search that made it
an easier or better search, or that made the search difficult?

For example, what was the most difficult or irritating thing
that happened? (PROBE)

_ _ _ _ _ _


5b. What was the easiest or most gratifying thing that happened? (PROBE)

_ _ _ _ _ _


5c. If a young engineer who had just joined the staff were starting
this same search today, what advice would you give him to make
the search easier? (PROBE)

_ _ _ _ _ _


5d. What would you warn him about? (PROBE)

_ _ _ _ _ _


6. Who conducted the search - you, a co-worker, librarian, or
someone else?

_ Self
_ Co-worker
_ Librarian
_ Other _
_ Computer


7. Do you recall the exact nature of your request - that is, did you
just generally describe the subject, were certain terms used, or
what?

_ _ _ _ _ _


8. Through what library or other offices was the search conducted?

_ Company library
_ ASTIA
_ University or college
_ Other _

9. Which of these statements most nearly describes how urgently you
needed the search results when you requested the search? Ignore
the importance of the results when you received them - we'll get
to that next. (HAND RESPONDENT CARD C)


_ Very urgent; other work held up. E.g., a search for
  information on the characteristics of a substance to
  be used in a current experiment.

_ Important; needed to help determine course of future
  work or to help fill in gaps in your knowledge. E.g.,
  a search for information on the performance of one of
  a class of possible circuits to be used in a piece of
  equipment.

_ Not very important; completeness of search results had
  little priority. E.g., a bibliography to be used as
  supplementary information.

(TAKE BACK CARD C)

10. Sometimes a search turns up significant information and sometimes
it adds little to the searcher's knowledge. Which of these statements
most nearly describes how important the results were? (HAND
RESPONDENT CARD D)


_ Very important. E.g., changed the course of a project,
  provided key information needed to obtain a contract.

_ Not very important. E.g., results were used as
  supplementary or back-up material.

_ Unimportant. E.g., results had little or no effect
  on course of work.

(TAKE BACK CARD D)

11a. Approximately how long was it from the time you made your request
until you had received the major group of relevant references?

11b. Was this adequate or did you really need the material sooner? (IF
NEEDED SOONER, ASK HOW SOON)

11c. What was the maximum amount of time you could have waited for the
major group of relevant references?


                        Q. 11a      Q. 11b      Q. 11c

                        Actual      Adequate    Maximum

1 day or less
2-3 days
4-13 days
2-7 weeks
2-6 months
More than 6 months
No Answer

12a. How old were the most recent references turned up by the search?
In other words, how recent was the material covered by the search?

12b. Was this adequate or did you really need more recent material?
(IF NEEDED MORE RECENT MATERIAL, ASK HOW RECENT.)

12c. Could you have gotten by with references that were all
(6 months or older, 1 year or older, etc.)? (START WITH CATEGORY
AFTER "ADEQUATE" AND CONTINUE UNTIL RESPONDENT SAYS "NO.")


                        Q. 12a      Q. 12b      Q. 12c

                        Actual      Adequate    Gotten by?
                                                Yes     No

Under 3 months
3-5 months
6-11 months
1-2 years
Over 2 years
Over 10 years
No Answer



13a. In what forms did the recovered references come to you? (READ LIST)

13b. Which of these do you generally prefer for this type of search?

13c. Which of the others are not preferred but generally adequate?

13d. Are there any that you consider inadequate for this type of search?


14a. Some irrelevant material is usually turned up in a search. What
proportion of the total time you spent on this search would you
guess was spent in culling out irrelevant or duplicate material?

14b. Was that about right or should you have had to spend less of your
time culling out irrelevant or duplicate material? (IF LESS, ASK
WHAT PROPORTION)

14c. Of the time you spent on the search, what is the maximum proportion
of your time you would have been willing to spend culling out
irrelevant material?


                           Q. 14a      Q. 14b         Q. 14c

                           Actual      About right    Maximum

Less than 1/4
1/4 but less than 1/2
1/2 but less than 3/4
3/4 or more
No Answer

15. (HAND RESPONDENT CARD E AND READ ALONG WITH HIM) I am going to
show you 7 cards, each of which contains a statement about a performance
measure by which document retrieval systems can be judged.
It is important to realize that these measures are to a degree in
conflict with one another. For example, if you want your requests
satisfied as quickly as possible, you normally must expect that
some relevant material will be overlooked. Similarly, if you want
the system to produce all or nearly all the relevant documents,
then you must expect a large number of irrelevant documents in
the results. (HAND RESPONDENT GROUP OF CARDS)

Please put these items in the order in which you would least want
to compromise on the type of search we've been discussing. Put
those you feel strongly you wouldn't want to compromise on your
left, those you wouldn't mind compromising on your right, and the
others in the middle. Now, put those in each group in order. If
you feel two items are equal in importance, put them together.


Order

_ a. Minimum time to get the major group of relevant references
     to you.
_ b. Minimum of irrelevant material produced by the search
_ c. Minimum of relevant material overlooked by the search
_ d. References come to you in form you prefer (complete
     document, abstract, citation, or document number)
_ e. Assurance that documents on a given subject do not exist
_ f. Minimum of effort on your part to communicate your request
     for a search
_ g. Certainty that specified sources over certain period of
     time were searched (certain that 100 percent of the
     sources were searched, certain that 90% were searched
     but 10% may not have been searched, etc.)

(AFTER RECORDING, TAKE BACK CARD E AND GROUP OF CARDS.)

16a. On the type of search we've been discussing, how long from the
time you make your request can you generally wait for a search
which covers 50% of the potential sources?

16b. How long for a search covering 80%?

16c. How long for a search covering all or almost all potential sources?

Q. 16a  50% _
Q. 16b  80% _
Q. 16c  Almost all _

17a. Again on the type of search we've been discussing, how many of
your own working days, weeks, or months would you be willing to
spend on the search if you could be sure 50% of the relevant
sources were located?

17b. How much if 80% of the relevant sources were located?

17c. And if almost all were located?

Q. 17a  50% _
Q. 17b  80% _
Q. 17c  Almost all _

18a. Let's assume for a moment that you initiated a search of the type
we've been discussing. Let's say that you personally have spent
X amount of time on the search and that the search covered sources
up through 2 years ago but nothing more recent. Proportionately
how much more working time would you personally be willing to
spend to see that sources up through 1 year ago were covered?
(OBTAIN ANSWERS IN MULTIPLES OF "X" - "Half again as much time,"
"Twice as much," etc.)

18b. How much to see that sources up through 6 months ago were located?

18c. And sources up through 1 month ago?

Q. 18a  Up through 1 year ago _
Q. 18b  Up through 6 months ago _
Q. 18c  Up through 1 month ago _


19a. And now a general question about your needs for coverage - that is,
the number of sources and period of time covered - for all the kinds
of searches you have done in the past few years. How often could
you have used these types of searches, ignoring the fact that you
may have been unable to do these searches with current tools?
(HAND RESPONDENT CARD F)


                                                    Once in              No
                                          Often     Awhile     Never     Answer

(For the last 5 years of publication)

The contents of 15 or less journals of
special interest to you

The contents of all the journals covered
by the major indexing & abstracting
services in your field

The contents of all the U.S. scientific &
technical journals

The contents of all English speaking
scientific and technical journals

The contents of all the world's scientific
& technical journals

(TAKE BACK CARD F)

19b. Would your answers differ if you weren't limited to searching the
last 5 years of publication? (IF YES, ASK HOW ANSWERS WOULD DIFFER)


And  now  a  few  background  questions. 


20. Name _

21.  Company _ _ 

22.  What  is  your  job  title?  _ 

23.  Would  you  classify  yourself  as  a  research  manager,  a  senior  engineer, 
an  engineer,  or  a  junior  engineer? 

Research  manager 
Senior  engineer 
Engineer 
Junior  engineer 
No  answer 


24. In a general technical sense, what do you consider to be your
specialty field? For example, computer design, microwave circuit
and techniques, etc.

_ _ _ _ _ _


25. What is the highest academic degree you hold and what year was it
conferred?


Degree                  Year conferred

_ BSEE
_ MSEE
_ Engineer
_ PhD, ScD
_ Other _
_ No Answer


26. Are you a member of IRE or of AIEE? If so, what type of membership
do you hold?


IRE                     AIEE

_ Not a member          _ Not a member
_ Fellow                _ Fellow
_ Sr. member            _ Member
_ Member                _ Associate
_ Associate             _ No Answer
_ Student


27. How many years of working engineering experience have you had in
these types of organizations? (READ LIST)

                                Years

University                      _
Research Institute              _
Industry                        _
Government Labs or Offices      _
Independent Consulting          _

TOTAL                           _


28. Have you authored any publications or given any technical papers
in the last three years? If so, how many technical articles or
papers? Any books? Anything else?

_ None
_ Technical articles or technical papers
_ Books
_ Other


29. Into which of the following age groups do you fall? (READ LIST)

_ Under 25
_ 25 to 29
_ 30 to 34
_ 35 to 39
_ 40 to 44
_ 45 and over


Date _

Length of Interview _ minutes


Table F-1

RESULTS OF THE SURVEY QUESTION REGARDING THE
SPECIFICATION OF THE SEARCH

Question 7. Do you recall the exact nature of your request - that is,
did you just generally describe the subject, were certain
terms used, or what?

Generally described problem, general subject     23%
Several broad headings                           13
Fairly specific                                  15
Specific terms, key words                        46
Other                                             3
No answer

Total                                           100%

Base  (92) 
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Table  F-2 

RESULTS OF THE SURVEY QUESTION REGARDING THE
RELATIVE RANKING OF THE REQUIREMENTS

Question 15. Please put these items in the order in which you would
least want to compromise. (Note: This is an abbreviated
form of the question.)


                                        Factors 1/

Rank                   a       b       c       d       e            f       g

(Most important)
1                     36%      4%     20%     10%   [illegible]     9%     11%
1.5                    2       2       -       1       1            2       -
2                     18       6      17      17      19            5      13
2.5                    -       -       1       -       2            -       1
3                     15       0      24      14      12            4      17
3.5                    -       -       1       -       1            -       -
4                     22      10      13      15      12           16      16
4.5                    -       -       -       -       -            -       -
5                      3      15      13      17      10           14      17
5.5                    -       -       -       -       -            -       -
6                      2      33       8      10      12           39      14
6.5                    -       -       1       1       2            -       2
(Least important)
7                      2      22       2      15      16           31       9
No answer              -       -       -       -       -            -       -

Total                100%    100%    100%    100%    100%         100%    100%

Base                 (92)    (92)    (92)    (92)    (92)         (92)    (92)

1/ The factors were as follows:

a. Minimum time to get the major group of relevant references
to you

b. Minimum of irrelevant material produced by the search

c. Minimum of relevant material overlooked by the search

d. References come to you in form you prefer (complete document,
abstract, citation, or document number)

e. Assurance that documents on a given subject do not exist

f. Minimum of effort on your part to communicate your request
for a search

g. Certainty that specified sources over certain period of time
were searched (certain that 100 percent of the sources were
searched, certain that 90% were searched but 10% may not have
been searched, etc.)
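
Table F-2 contains half-step ranks (1.5, 2.5, and so on), and Question 15 allows a respondent to put equally important items together. One common convention that yields such values is to give tied items the mean of the positions they jointly occupy. The Python sketch below illustrates that convention only; whether the study assigned fractional ranks in exactly this way is an assumption, and the function name and the sample ordering are hypothetical.

```python
def average_ranks(groups):
    """Assign ranks to factors, giving tied factors the mean of their positions.

    `groups` is a list of lists of factor labels, most important first;
    labels in the same inner list were judged equally important.
    """
    ranks = {}
    position = 1  # next unoccupied rank position
    for group in groups:
        # A group of k tied items occupies positions position .. position+k-1.
        mean_rank = position + (len(group) - 1) / 2
        for factor in group:
            ranks[factor] = mean_rank
        position += len(group)
    return ranks


if __name__ == "__main__":
    # Hypothetical respondent: a and c tied for first, then e,
    # then b, d, g tied, then f last.
    ordering = [["a", "c"], ["e"], ["b", "d", "g"], ["f"]]
    for factor, rank in sorted(average_ranks(ordering).items()):
        print(factor, rank)
```

Under this convention a two-way tie produces a half-step rank, while a three-way tie produces a whole-number rank, which would allow a whole-number rank row to total more than 100 percent across factors.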
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Table  F-3 


RESULTS  OF  THE  SURVEY  QUESTION  REGARDING  THE 
TOLERABLE  DELAY  IN  OBTAINING  THE  SEARCH  PRODUCT 

Question  16a.  On  the  type  of  search  we've  been  discussing^  how  long 
from  the  time  you  make  your  request  can  you  genei^ally 
watt  for  a  search  which  covers  50%  of  the  potent.  ,1 
sources? 

Question  16b.  How  long  for  a  search  covering  00%? 

Question  16c.  How  long  for  a  search  covering  all  or  almost  all 


potential  sources? 

50%  of 

C0%  of 

Almost  all 

Sources 

Sources 

Sources 

3  days  or  loss 

OROf 

apiw/o 

nor 
o  /o 

nnt 

^/o 

4-7  days 

24 

19 

0 

C  -  13  days 

4 

5 

0 

2-3  weeks 

30 

33 

27 

4-7  weeks 

14 

27 

24 

2-3  months 

2 

11 

22 

More  than  3  months 

— 

1 

9 

No  answer 

__ 

1 

3 

Total 

100% 

100% 

100% 

Base 

(92) 

(92) 

(92) 
Table F-4

RESULTS OF THE SURVEY QUESTION REGARDING THE
TOLERABLE EFFORT TO LOCATE RELEVANT MATERIAL

Question 17a. Again on the type of search we've been discussing, how
              many of your own working days, weeks, or months would
              you be willing to spend on the search if you could be
              sure 50% of the relevant sources were located?

Question 17b. How much if 80% of the relevant sources were located?

Question 17c. And if almost all were located?

                            50% of      80% of      Almost All
                            Relevant    Relevant    Relevant
                            Sources     Sources     Sources

1 day or less               37%         22%         21%
2-4 days                    20          36          23
1 week, but less than 2     23          15          21
2 weeks, but less than 3    2           10          14
3 weeks or more             2           10          13
No answer                   8           7           0

Total                       100%        100%        100%
Base                        (92)        (92)        (92)

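The percentages in Table F-4 are easier to compare when read cumulatively. The following is a minimal sketch, not part of the original study, that accumulates the "80% of Relevant Sources" column (22, 36, 15, 10, 10, 7) and converts each percentage to an approximate number of interviewees using the base of 92. The function names `cumulative` and `approx_counts` are hypothetical, and the counts are only approximate because the published percentages are already rounded.

```python
from itertools import accumulate

# Row labels and the "80% of Relevant Sources" column of Table F-4,
# copied in row order from the table.
CATEGORIES = [
    "1 day or less",
    "2-4 days",
    "1 week, but less than 2",
    "2 weeks, but less than 3",
    "3 weeks or more",
    "No answer",
]
PERCENT_80 = [22, 36, 15, 10, 10, 7]
BASE = 92  # number of interviewees, from the Base row


def cumulative(percentages):
    """Return the running totals of a list of percentages."""
    return list(accumulate(percentages))


def approx_counts(percentages, base):
    """Convert rounded percentages to approximate respondent counts."""
    return [round(p * base / 100) for p in percentages]


if __name__ == "__main__":
    rows = zip(CATEGORIES, cumulative(PERCENT_80), approx_counts(PERCENT_80, BASE))
    for label, cum, n in rows:
        print(f"{label:<26} cumulative {cum:>3}%  (about {n} respondents)")
```

Read this way, the running total through the "2-4 days" row is 58%, meaning that somewhat more than half of the interviewees named a tolerable effort of four days or less for a search that locates 80% of the relevant sources.
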
Table F-5

RESULTS OF THE SURVEY QUESTION REGARDING THE
TOLERABLE EFFORT TO OBTAIN RECENT MATERIAL

Question 18a. Let's assume for a moment that you initiated a search of
              the type we've been discussing. Let's say that you
              personally have spent X amount of time on the search and
              that the search covered sources up through 2 years ago
              but nothing more recent. Proportionately how much more
              working time would you personally be willing to spend
              to see that sources up through 1 year ago were covered?

Question 18b. How much to see that sources up through 6 months ago
              were located?

Question 18c. And sources up through 1 month ago?

                          Through     Through     Through
                          1 Year      6 Months    1 Month
                          Ago         Ago         Ago

1/2 X or less             30%         18%         15%
More than 1/2 X - 1 X     23          30          24
2 X - 4 X                 25          29          36
5 or more X               11          20          22
No answer                 3           3           3

Total                     100%        100%        100%
Base                      (92)        (92)        (92)

Note: All data are in terms of effort to update from 2 years ago.
      Thus the data for 6 months indicate effort to update from 2
      years to 6 months ago, and data for 1 month ago indicate effort
      to update from 2 years to 1 month ago.

Table F-6

SPECIALTY FIELDS OF THE INTERVIEWEES

Question 24. In a general technical sense, what do you consider to
             be your specialty field? For example, computer design,
             microwave circuits and techniques, etc.

Circuits and devices (primarily digital techniques)    43%
Microwave and communication engineering                21
Antennas and propagation                               10
Communication theory                                   7
Other                                                  19
No answer                                              --

Total                                                  100%
Base                                                   (92)


REFERENCES

1.  John C. Flanagan, "The Critical Incident Technique," Psychological
    Bulletin 51, No. 4, pp. 327-358 (July 1954).

2.  H. Bernstein, "A Paradigm for a Retrieval Effectiveness Experiment,"
    American Documentation 12, No. 4, pp. [illegible] (October 1961).

3.  C. Cleverdon, "The Evaluation of Systems Used in Information
    Retrieval," Sec. 4, Proc. of the International Conference on
    Scientific Information, National Academy of Sciences, Washington,
    D.C. (1958).

4.  C. Cleverdon, "Interim Report of the ASLIB Cranfield Research
    Project," College of Aeronautics, Cranfield, England (November
    [year illegible]).

5.  C. N. Mooers, "The Intensive Sample Test for the Objective Evaluation
    of the Performance of Information Retrieval Systems," Report
    ZTB-[number illegible], Zator Company, Cambridge, Massachusetts
    (August 1959).

6.  D. R. Swanson, "Searching Natural Language Text by Computer,"
    Science 132, pp. 1099-1104 (1960).

7.  M. G. Kendall, Rank Correlation Methods (Hafner Publishing Co.,
    New York, 1948; 2d ed. 1955).

8.  C. P. Bourne, "The World's Technical Journal Literature: An
    Estimate of Volume, Origin, Language, Field, Indexing, and
    Abstracting," Internal Research Memorandum, Stanford Research
    Institute, Menlo Park, California (August 1961).

9.  C. P. Bourne, "The Organization of a Memory System for Information
    Retrieval Applications," Supplement A to Quarterly Report 2,
    Contract AF 30(602)-2142, Stanford Research Institute, Menlo Park,
    California (June 1960).

10. E. L. Grant and W. G. Ireson, Principles of Engineering Economy,
    4th ed. (The Ronald Press, New York, 1960).

11. M. W. Mueller, "Time, Cost and Value Factors in Information
    Retrieval," General Information Manual: Information Retrieval
    Systems Conference, Brochure E20-8040, International Business
    Machines, White Plains, New York (1960).

12. M. G. Kendall, op. cit., Appendix Table 6, p. 106.


